path: root/py/objstr.c
Age | Commit message | Author
2022-05-03 | all: Use mp_obj_malloc everywhere it's applicable. | Jim Mussared
This replaces occurrences of

    foo_t *foo = m_new_obj(foo_t);
    foo->base.type = &foo_type;

with

    foo_t *foo = mp_obj_malloc(foo_t, &foo_type);

Excludes any places where base is a sub-field or when new0/memset is used. Signed-off-by: Jim Mussared <jim.mussared@gmail.com>
2022-01-19 | py/objstr: Support '{:08}'.format("Jan") like Python 3.10. | Jeff Epler
The new test has an .exp file, because it is not compatible with Python 3.9 and lower. See CPython version of the issue at https://bugs.python.org/issue27772 Signed-off-by: Jeff Epler <jepler@gmail.com>
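As a rough illustration of the behaviour this commit targets (a sketch run against CPython, not taken from the MicroPython test suite): from Python 3.10 a leading 0 in a string format spec acts as a fill character with the string's default left alignment. The helper name zero_fill is hypothetical.

```python
import sys


def zero_fill(text, width):
    """Left-align text in `width` columns, filling with '0' (any Python 3)."""
    return "{:0<{}}".format(text, width)


explicit = zero_fill("Jan", 8)

# The short form relies on the Python 3.10+ behaviour; older versions
# raise ValueError for a zero-padded string format spec.
if sys.version_info >= (3, 10):
    short_form = "{:08}".format("Jan")
else:
    short_form = None
```

The explicit `0<` form works on older interpreters too, which is why the short form needs a separate expected-output file in the test suite.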
2021-07-15 | py: Introduce and use mp_raise_type_arg helper. | Damien George
To reduce code size. Signed-off-by: Damien George <damien@micropython.org>
2021-04-27 | py: Add option to compile without any error messages at all. | Damien George
This introduces a new option, MICROPY_ERROR_REPORTING_NONE, which completely disables all error messages. To be used in cases where MicroPython needs to fit in very limited systems. Signed-off-by: Damien George <damien@micropython.org>
2020-12-07 | py/mpprint: Fix length calculation for strings with precision-modifier. | Joris Peeraer
Two issues are tackled:
1. The calculation of the correct length to print is fixed to treat the precision as a maximum length instead of as the exact length. This is done for both qstr (%q) and for regular str (%s).
2. The incorrect use of mp_printf("%.*s") is replaced with mp_print_strn(). Because of the fix for the above issue, some test cases that print an embedded null byte (^@ in test output) would now fail. The bug here is that "%s" was used to print null bytes. Instead, mp_print_strn is used to make sure all bytes are output and the exact length is respected.
Test cases are added for both %s and %q with a combination of precision and padding specifiers.
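The "precision is a maximum, not an exact length" rule can be sketched with Python's own % operator, which follows the same printf convention. This is only an analogue of the semantics, not a call into mp_printf; the variable names are illustrative.

```python
# Precision caps the number of characters printed; it never pads.
capped = "%.2s" % "abc"      # precision shorter than the string
uncapped = "%.5s" % "abc"    # precision longer than the string
padded = "%5.2s" % "abc"     # width pads, precision truncates

results = (capped, uncapped, padded)
```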
2020-09-24 | py/objstr: Make bytes(bytes_obj) return bytes_obj. | Iyassou Shimels
Calling the bytes constructor on a bytes object returns the original bytes object. This saves allocating a new instance, and matches CPython. Signed-off-by: Iyassou Shimels <s.iyassou@gmail.com>
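A minimal sketch of the CPython behaviour being matched. Object identity here is an implementation detail observed in CPython for exact bytes objects, not a language guarantee; the variable names are illustrative.

```python
original = b"abc"

# For an exact bytes object the constructor hands back the same object.
same = bytes(original)

# Converting from a different type still has to build a new bytes object.
from_bytearray = bytes(bytearray(original))

is_same_object = same is original
```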
2020-04-23 | all: Format code to add space after C++-style comment start. | stijn
Note: the uncrustify configuration is explicitly set to 'add' instead of 'force' in order not to alter the comments which use extra spaces after // as a means of indenting text for clarity.
2020-04-05 | all: Use MP_ERROR_TEXT for all error messages. | Jim Mussared
2020-04-05 | py: Use preprocessor to detect error reporting level (terse/detailed). | Jim Mussared
Instead of compiler-level if-logic. This is necessary to know what error strings are included in the build at the preprocessor stage, so that string compression can be implemented.
2020-03-11 | py/objstr: Remove duplicate % in error string. | Tom Collins
The double-% was added in 11de8399fe5f9ef54589b14470faf8d4fcc5ccaa (Jun 2014) when such errors were formatted with printf. But then 55830dd9bf4fee87c0a6d3f38c51614fea0eb483 (Dec 2018) changed mp_obj_new_exception_msg() to not format the message, as discussed in #3004. So such error strings are no longer formatted and a % is just that.
2020-02-28 | all: Reformat C and Python source code with tools/codeformat.py. | Damien George
This is run with uncrustify 0.70.1, and black 19.10b0.
2020-02-13 | py: Add mp_raise_msg_varg helper and use it where appropriate. | Damien George
This commit adds mp_raise_msg_varg(type, fmt, ...) as a helper for nlr_raise(mp_obj_new_exception_msg_varg(type, fmt, ...)). It makes the C-level API for raising exceptions more consistent, and reduces code size on most ports:

       bare-arm:   +28 +0.042%
    minimal x86:  +100 +0.067%
       unix x64:   -56 -0.011%
    unix nanbox:  -300 -0.068%
          stm32:  -204 -0.054% PYBV10
         cc3200:    +0 +0.000%
        esp8266:   -64 -0.010% GENERIC
          esp32:  -104 -0.007% GENERIC
            nrf:  -136 -0.094% pca10040
           samd:    +0 +0.000% ADAFRUIT_ITSYBITSY_M4_EXPRESS
2020-01-24 | py/obj.h: Add and use mp_obj_is_bool() helper. | Yonatan Goldschmidt
Commit d96cfd13e3a464862cecffb2718c6286b52c77b0 introduced a regression in testing for bool objects: such objects were in some cases no longer recognised as bools, eg when using mp_obj_is_type(o, &mp_type_bool), or mp_obj_is_integer(o). This commit fixes that problem by adding mp_obj_is_bool(o). Builds with MICROPY_OBJ_IMMEDIATE_OBJS enabled check if the object is either of the const True or False objects. Builds without it use the old method of ->type checking, which compiles to smaller code than the former method. Fixes #5538.
2020-01-09 | py: Make mp_obj_get_type() return a const ptr to mp_obj_type_t. | Damien George
Most types are in rodata/ROM, and mp_obj_base_t.type is a constant pointer, so enforce this const-ness throughout the code base. If a type ever needs to be modified (eg a user type) then a simple cast can be used.
2019-12-27 | py/objstr: Don't use inline GET_STR_DATA_LEN for object-repr D. | Damien George
Changing to use the helper function mp_obj_str_get_data_no_check() reduces code size of nan-boxing builds by about 1000 bytes.
2019-10-22 | py/objstr: Size-optimise failure path for mp_obj_str_get_buffer. | Jim Mussared
These fields are never looked at if the function returns non-zero.
2019-09-26 | py: Rename MP_QSTR_NULL to MP_QSTRnull to avoid intern collisions. | Josh Lloyd
Fixes #5140.
2019-02-12 | py: Downcase all MP_OBJ_IS_xxx macros to make a more consistent C API. | Damien George
These macros could in principle be (inline) functions so it makes sense to have them lower case, to match the other C API functions. The remaining macros that are upper case are:
- MP_OBJ_TO_PTR, MP_OBJ_FROM_PTR
- MP_OBJ_NEW_SMALL_INT, MP_OBJ_SMALL_INT_VALUE
- MP_OBJ_NEW_QSTR, MP_OBJ_QSTR_VALUE
- MP_OBJ_FUN_MAKE_SIG
- MP_DECLARE_CONST_xxx
- MP_DEFINE_CONST_xxx
These must remain macros because they are used when defining const data (at least, MP_OBJ_NEW_SMALL_INT is so it makes sense to have MP_OBJ_SMALL_INT_VALUE also a macro). For those macros that have been made lower case, compatibility macros are provided for the old names so that users do not need to change their code immediately.
2019-02-06 | py: Update my copyright info on some files. | Paul Sokolovsky
Based on git history.
2018-10-22 | py/objstr: Make str.count() method configurable. | Paul Sokolovsky
Configurable via MICROPY_PY_BUILTINS_STR_COUNT. Default is enabled. Disabled for bare-arm, minimal, unix-minimal and zephyr ports. Disabling it saves 408 bytes on x86.
2018-09-26 | py/objstr: format: Return bytes result for bytes format string. | Paul Sokolovsky
This is an improvement over previous behavior when str was returned for both str and bytes input format. This new behaviour is also consistent with how the % operator works, as well as many other str/bytes methods. It should be noted that it's not how current versions of CPython work, where there's a gap in the functionality and bytes.format() is not supported.
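For comparison, a small sketch of the CPython side of this gap: the % operator on a bytes format string yields bytes, while the bytes type has no format method at all. The variable names are illustrative, and the snippet only describes CPython, not the MicroPython result.

```python
# The % operator keeps the type of the format string.
percent_result = b"%s=%d" % (b"x", 1)
result_type = type(percent_result).__name__

# CPython offers str.format but no bytes.format.
str_has_format = hasattr(str, "format")
bytes_has_format = hasattr(bytes, "format")
```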
2018-09-20 | py/objstr: Make % (__mod__) formatting operator configurable. | Paul Sokolovsky
Default is enabled, disabled for minimal builds. Saves 1296 bytes on x86, 976 bytes on ARM.
2018-09-20 | py: Shorten error messages by using contractions and some rewording. | Damien George
2018-07-30 | py/objstr: In format error message, use common string with %s for type. | Damien George
This error message did not consume all of its variable args, a bug introduced long ago in baf6f14deb567ab626c1b05213af346108f41700. By fixing it to use %s (instead of keeping the string as-is and deleting the last arg) the same error message string is now reused three times in this format function and gives a code size reduction of around 130 bytes. It also now gives a better error message when a non-string is passed in as an argument to format, eg '{:d}'.format([]).
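The kind of call mentioned at the end of this message can be sketched as follows. The exception types shown are those CPython raises; the exact message text differs between implementations, so only the type is captured. The helper name format_int is hypothetical.

```python
def format_int(value):
    """Format value with the 'd' presentation type, reporting the error type on failure."""
    try:
        return "{:d}".format(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


ok = format_int(42)
bad_list = format_int([])     # a non-string, non-number argument
bad_str = format_int("x")     # 'd' is not valid for str
```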
2018-04-05 | py/objstr: In find/rfind, don't crash when end < start. | Jeff Epler
2018-03-30 | py/runtime: Check that keys in dicts passed as ** args are strings. | Damien George
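A short sketch of the expected result when end is less than start, as observed in CPython: the search range is empty, so both methods report "not found". The variable names are illustrative.

```python
text = "hello"

# start=4, end=2 describes an empty range, so nothing can match.
found = text.find("l", 4, 2)
rfound = text.rfind("l", 4, 2)

# Even the empty substring is not found in an inverted range.
empty = text.find("", 4, 2)
```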
Prior to this patch the code would crash if a key in a ** dict was anything other than a str or qstr. This is because mp_setup_code_state() assumes that keys in kwargs are qstrs (for efficiency). Thanks to @jepler for finding the bug.
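A minimal sketch of the check from the Python side, as CPython behaves: a non-string key in a ** dict is rejected with TypeError instead of reaching the callee. The function names are hypothetical.

```python
def collect(**kwargs):
    """Return the keyword arguments that were received."""
    return kwargs


def call_with(mapping):
    """Call collect(**mapping), reporting the exception type on failure."""
    try:
        return collect(**mapping)
    except TypeError as exc:
        return type(exc).__name__


good = call_with({"a": 1})
bad = call_with({1: 2})   # the key is an int, not a str
```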
2018-02-20 | py/objstr: Remove unnecessary check for positive splits variable. | Damien George
At this point in the code the variable "splits" is guaranteed to be positive due to the check for "splits == 0" above it.
2018-02-19 | py/objstr: Protect against creating bytes(n) with n negative. | Damien George
Prior to this patch uPy (on a 32-bit arch) would have severe issues when calling bytes(-1): such a call would call vstr_init_len(vstr, -1) which would then +1 on the len and call vstr_init(vstr, 0), which would then round this up and allocate a small amount of memory for the vstr. The bytes constructor would then attempt to zero out all this memory, thinking it had allocated 2^32-1 bytes.
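The guarded call can be sketched like this. The ValueError shown is what CPython raises for a negative count; the helper name try_bytes is hypothetical.

```python
def try_bytes(n):
    """Build bytes(n), reporting the exception type if the count is rejected."""
    try:
        return bytes(n)
    except ValueError as exc:
        return type(exc).__name__


zeros = try_bytes(3)       # three zero bytes
negative = try_bytes(-1)   # rejected rather than allocating
```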
2018-02-14 | py/unicode: Clean up utf8 funcs and provide non-utf8 inline versions. | Damien George
This patch provides inline versions of the utf8 helper functions for the case when unicode is disabled (MICROPY_PY_BUILTINS_STR_UNICODE set to 0). This saves code size. The unichar_charlen function is also renamed to utf8_charlen to match the other utf8 helper functions, and the signature of this function is adjusted for consistency (const char* -> const byte*, mp_uint_t -> size_t).
2017-11-29 | py: Annotate func defs with NORETURN when their corresp decls have it. | Damien George
2017-11-24 | py/runtime: Add MP_BINARY_OP_CONTAINS as reverse of MP_BINARY_OP_IN. | Damien George
Before this patch MP_BINARY_OP_IN had two meanings: coming from bytecode it meant that the args needed to be swapped, but coming from within the runtime meant that the args were already in the correct order. This led to some confusion in the code and comments stating how args were reversed. It also led to 2 bugs: 1) containment for a subclass of a native type didn't work; 2) the expression "{True} in True" would illegally succeed and return True. In both of these cases it was because the args to MP_BINARY_OP_IN ended up being reversed twice. To fix these things this patch introduces MP_BINARY_OP_CONTAINS which corresponds exactly to the __contains__ special method, and this is the operator that built-in types should implement. MP_BINARY_OP_IN is now only emitted by the compiler and is converted to MP_BINARY_OP_CONTAINS by swapping the arguments.
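The two bug cases can be sketched at the Python level, showing the results a correct implementation is expected to give (checked here against CPython). The class name MyList and the helper are hypothetical.

```python
class MyList(list):
    """A trivial subclass of a native type."""


# Case 1: containment on a subclass of a native type should work.
subclass_contains = 2 in MyList([1, 2, 3])


def set_in_bool():
    """Case 2: `{True} in True` must fail, since bool is not a container."""
    try:
        return {True} in True
    except TypeError as exc:
        return type(exc).__name__


bad_contains = set_in_bool()

# `x in y` corresponds to y.__contains__(x): the operands are swapped.
swapped = [1, 2, 3].__contains__(2)
```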
2017-11-16 | py/objstr: When constructing str from bytes, check for existing qstr. | Damien George
This patch uses existing qstr data where possible when constructing a str from a bytes object.
2017-11-16 | py/objstr: Make mp_obj_new_str_of_type check for existing interned qstr. | Damien George
The function mp_obj_new_str_of_type is a general str object constructor used in many places in the code to create either a str or bytes object. When creating a str it should first check if the string data already exists as an interned qstr, and if so then return the qstr object. This patch makes the function have such behaviour, which helps to reduce heap usage by reusing existing interned data where possible. The old behaviour of mp_obj_new_str_of_type (which didn't check for existing interned data) is made available through the function mp_obj_new_str_copy, but should only be used in very special cases. One consequence of this patch is that the following expression is now True: 'abc' is ' abc '.split()[0]
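The identity result above is specific to MicroPython's qstr reuse. A loose CPython analogue, assuming sys.intern as a stand-in for looking up an existing interned string, might look like this; in CPython the plain `is` comparison of a split result with a literal is not guaranteed either way, so only the explicitly interned form is compared.

```python
import sys

piece = " abc ".split()[0]

# Equality always holds; identity depends on the implementation.
equal = piece == "abc"

# Explicit interning returns one shared object for equal strings.
interned_same = sys.intern(piece) is sys.intern("abc")
```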
2017-11-16 | py/objstr: Remove "make_qstr_if_not_already" arg from mp_obj_new_str. | Damien George
This patch simplifies the str creation API to favour the common case of creating a str object that is not forced to be interned. To force interning of a new str the new mp_obj_new_str_via_qstr function is added, and should only be used if warranted. Apart from simplifying the mp_obj_new_str function (and making it have the same signature as mp_obj_new_bytes), this patch also reduces code size by a bit (-16 bytes for bare-arm and roughly -40 bytes on the bare-metal archs).
2017-10-04 | py/objstr: Make empty bytes object have a null-terminating byte. | Damien George
A lot of string processing functions assume there is a null terminating byte so that they can work efficiently. Fixes issue #3334.
2017-10-04 | all: Remove inclusion of internal py header files. | Damien George
Header files that are considered internal to the py core and should not normally be included directly are:
- py/nlr.h - internal nlr configuration and declarations
- py/bc0.h - contains bytecode macro definitions
- py/runtime0.h - contains basic runtime enums
Instead, the top-level header files to include are one of:
- py/obj.h - includes runtime0.h and defines everything to use the mp_obj_t type
- py/runtime.h - includes mpstate.h and hence nlr.h, obj.h, runtime0.h, and defines everything to use the general runtime support functions
Additional, specific headers (eg py/objlist.h) can be included if needed.
2017-09-19 | py/objstr: strip: Don't strip "\0" by default. | Paul Sokolovsky
The issue was caused by incorrectly taking the size of the default strip character set.
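A small sketch of the intended behaviour, checked against CPython: NUL is not whitespace, so the default strip leaves it alone, and it is only removed when listed explicitly. The variable names are illustrative.

```python
s = "\0 abc \0"

# Default strip removes whitespace only; the NUL characters at both
# ends stop it immediately, so the string is unchanged.
default_strip = s.strip()

# NUL is stripped only when it is part of the explicit character set.
explicit_strip = s.strip("\0 ")
```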
2017-09-06 | py/objstr: Add check for valid UTF-8 when making a str from bytes. | tll
This patch adds a function utf8_check() to check for a valid UTF-8 encoded string, and calls it when constructing a str from raw bytes. The feature is selectable at compile time via MICROPY_PY_BUILTINS_STR_UNICODE_CHECK and is enabled if unicode is enabled. It costs about 110 bytes on Thumb-2, 150 bytes on Xtensa and 170 bytes on x86-64.
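From the Python side, the effect of such a check can be sketched as below. This uses the standard str(bytes, encoding) constructor and the exception types CPython raises; it does not call utf8_check() itself, and the helper name decode_or_error is hypothetical.

```python
def decode_or_error(raw):
    """Build a str from raw UTF-8 bytes, reporting the error type if invalid."""
    try:
        return str(raw, "utf-8")
    except UnicodeError as exc:
        return type(exc).__name__


valid = decode_or_error(b"caf\xc3\xa9")   # well-formed two-byte sequence
invalid = decode_or_error(b"\xff\xfe")    # 0xff can never appear in UTF-8
```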
2017-08-29 | all: Convert mp_uint_t to mp_unary_op_t/mp_binary_op_t where appropriate | Damien George
The unary-op/binary-op enums are already defined, and there are no arithmetic tricks used with these types, so it makes sense to use the correct enum type for arguments that take these values. It also reduces code size quite a bit for nan-boxing builds.
2017-08-29 | py/objstr: startswith, endswith: Check arg to be a string. | Paul Sokolovsky
Otherwise, it will silently give an incorrect result on other value types, including the CPython tuple form like "foo.png".endswith(("png", "jpg")) (which MicroPython doesn't support, to avoid bloat).
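A sketch of the argument check as seen from Python, run against CPython. Note that the tuple form succeeds in CPython, whereas per this commit MicroPython rejects it rather than supporting it. The helper name ends_with is hypothetical.

```python
def ends_with(name, suffix):
    """Call name.endswith(suffix), reporting the exception type on a bad argument."""
    try:
        return name.endswith(suffix)
    except TypeError as exc:
        return type(exc).__name__


plain = ends_with("foo.png", "png")
tuple_form = ends_with("foo.png", ("png", "jpg"))   # CPython-only convenience
wrong_type = ends_with("foo.png", 1)                # rejected, not silently wrong
```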
2017-08-13 | all: Raise exceptions via mp_raise_XXX | Javier Candeira
- Changed: ValueError, TypeError, NotImplementedError
- OSError invocations unchanged, because the corresponding utility function takes ints, not strings like the long form invocation.
- OverflowError, IndexError and RuntimeError etc. not changed for now until we decide whether to add new utility functions.
2017-08-09 | py/objstr: Raise an exception for wrong type on RHS of str binary op. | Damien George
The main case to catch is invalid types for the containment operator, of the form str.__contains__(non-str).
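The containment case can be sketched as follows, using the exception type CPython raises for a non-str left operand. The helper name contained_in is hypothetical.

```python
def contained_in(needle, haystack):
    """Evaluate `needle in haystack`, reporting the exception type on failure."""
    try:
        return needle in haystack
    except TypeError as exc:
        return type(exc).__name__


str_needle = contained_in("b", "abc")
int_needle = contained_in(1, "abc")   # equivalent to "abc".__contains__(1)
```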
2017-07-31 | all: Use the name MicroPython consistently in comments | Alexander Steffen
There were several different spellings of MicroPython present in comments, when there should be only one.
2017-07-04 | py/objstr: Remove unnecessary "sign" variable in formatting code. | Damien George
2017-07-02 | py/objstr: Move uPy function wrappers to just after the C function. | Damien George
This matches the coding/layout style of all the other objects.
2017-06-08 | py/objstr: Allow to compile with obj-repr D, and unicode disabled. | Damien George
2017-06-02 | py/objstr: Catch case of negative "maxsplit" arg to str.rsplit(). | Damien George
Negative values mean no limit on the number of splits so should delegate to the .split() method.
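A brief sketch of the rule, checked against CPython: with a negative maxsplit, rsplit produces the same list as an unlimited split. The variable names are illustrative.

```python
text = "a b c d"

# A positive maxsplit splits from the right at most that many times.
limited = text.rsplit(" ", 1)

# A negative maxsplit means "no limit", so the result matches split().
unlimited = text.rsplit(" ", -1)
same_as_split = unlimited == text.split(" ")
```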
2017-05-29 | various: Spelling fixes | Ville Skyttä
2017-04-02 | py/objstr: Use MICROPY_FULL_CHECKS for range checking when constructing bytes. | Paul Sokolovsky
Split this setting from MICROPY_CPYTHON_COMPAT. The idea is to be able to keep MICROPY_CPYTHON_COMPAT disabled, but still pass more of the regression test suite. In particular, this fixes the last failing test in basics/ for the Zephyr port.
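The range check in question is visible from Python when bytes is built from an iterable of ints. A minimal sketch, showing the CPython result for an out-of-range element; the helper name bytes_from is hypothetical.

```python
def bytes_from(values):
    """Build bytes from an iterable of ints, reporting the error type if out of range."""
    try:
        return bytes(values)
    except ValueError as exc:
        return type(exc).__name__


in_range = bytes_from([0, 127, 255])
too_big = bytes_from([256])    # each element must be in range(256)
negative = bytes_from([-1])
```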
2017-03-29 | py: Change mp_uint_t to size_t for mp_obj_str_get_data len arg. | Damien George