user6774416 - 25 days ago
Python Question

How big can the input to input() be?

How large can the input text I supply to the input() function possibly be?

Unfortunately, there was no easy way to test it; even after a lot of copy-pasting, I couldn't get input to fail on any input I supplied (and I eventually gave up).

The documentation for the input function doesn't mention anything regarding this:


If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that. When EOF is read, EOFError is raised.


So I'm guessing there is no limit? Does anyone know whether there is one and, if so, what it is?

Answer

Of course there is; it can't be limitless*. The key sentence from the documentation that I believe needs highlighting is:

[...] reads a line from input, converts it to a string [...]

(emphasis mine)

Since it converts the input you supply into a Python str object, it essentially translates to: "Its size has to be less than or equal to the largest string you can create".

No explicit size is given, probably because this is an implementation detail. Enforcing one maximum size on every other implementation of Python wouldn't make much sense.

*In CPython, at least, the largest size of a string is bounded by how big its index is allowed to be (see PEP 353). That is, how big the number in the brackets [] is when you try and index it:

>>> s = ''
>>> s[2 ** 63]

IndexError                                Traceback (most recent call last)
<ipython-input-10-75e9ac36da20> in <module>()
----> 1 s[2 ** 63]

IndexError: cannot fit 'int' into an index-sized integer

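(Try the same with 2 ** 63 - 1, the largest acceptable positive index; -2 ** 63 is the negative limit. On a 64-bit build both fit in an index-sized integer, so you get the ordinary "string index out of range" IndexError instead.)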

Indices are not stored as Python ints internally; instead, they use Py_ssize_t, a signed 32-bit or 64-bit integer on 32-bit and 64-bit machines respectively. So that appears to be the hard limit.

(As the error message states, an int and an index-sized integer are two different things.)
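
As a quick check of that limit from Python itself: sys.maxsize is documented as the largest value a Py_ssize_t can take, so the short sketch below prints the bound for whatever build runs it. It assumes CPython; the exact error messages may differ on other implementations.

```python
import sys

# sys.maxsize is the largest value a Py_ssize_t can take on this build:
# usually 2**31 - 1 on a 32-bit build and 2**63 - 1 on a 64-bit build.
limit = sys.maxsize
print(limit)

# An index within the limit is accepted as an index and fails only
# because the empty string has no character at that position.
try:
    ''[limit]
except IndexError as exc:
    in_range_message = str(exc)

# One past the limit cannot be converted to a Py_ssize_t at all.
try:
    ''[limit + 1]
except IndexError as exc:
    too_big_message = str(exc)

print(in_range_message)
print(too_big_message)
```

The two messages differ because the second failure happens before the string is even consulted.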

It also seems like input() explicitly checks if the input supplied is larger than PY_SSIZE_T_MAX (the maximum size of Py_ssize_t) before converting:

if (len > PY_SSIZE_T_MAX) {
    PyErr_SetString(PyExc_OverflowError,
                    "input: input too long");
    result = NULL;
}

Then it converts the input to a Python str with PyUnicode_Decode.
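
One way to try long inputs without copy-pasting is to swap sys.stdin for an in-memory stream. This is only a sketch: the helper name read_line_via_input and the 10-million-character size are made up for illustration. It also rests on an assumption worth stating: when stdin is not an interactive terminal, CPython's input() falls back to calling the stream's readline() instead of the terminal code path quoted above, so this exercises the string-size limit in general, not that specific check.

```python
import io
import sys


def read_line_via_input(text):
    """Feed `text` to input() by temporarily replacing sys.stdin (hypothetical helper)."""
    original_stdin = sys.stdin
    sys.stdin = io.StringIO(text)
    try:
        return input()
    finally:
        # Always restore the real stdin, even if input() raises.
        sys.stdin = original_stdin


size = 10_000_000  # arbitrary illustrative size; raise it to probe further
line = read_line_via_input("x" * size + "\n")

# input() strips the trailing newline, so the length equals `size`.
print(len(line))
```

Raising size far enough will exhaust memory long before it gets anywhere near sys.maxsize on a 64-bit build.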


To put that in perspective: if the average book is 500,000 characters long, you could in essence input around:

>>> ((2 ** 63) - 1) // 500000
18446744073709

books. It would probably take you some time, though, and you'd be limited by available memory first.
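
To make the memory caveat concrete, here is a rough estimate under stated assumptions: a hypothetical machine with 16 GiB of RAM, books made purely of ASCII text, and CPython, where sys.getsizeof reports the in-memory size of a str. None of these figures come from the documentation; they are only there to show the orders of magnitude.

```python
import sys

BOOK_CHARS = 500_000        # the "average book" figure used above
RAM_BYTES = 16 * 2 ** 30    # assumption: a machine with 16 GiB of memory

# Measure how many bytes one all-ASCII book occupies as a str object.
book = "a" * BOOK_CHARS
bytes_per_book = sys.getsizeof(book)

# How many such books fit in RAM, versus the index-based hard limit.
books_in_ram = RAM_BYTES // bytes_per_book
books_at_hard_limit = sys.maxsize // BOOK_CHARS

print(bytes_per_book)
print(books_in_ram)
print(books_at_hard_limit)
```

On a 64-bit build the memory-bound figure comes out many orders of magnitude below the index-bound one, which is why memory is the limit you would hit in practice.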