Python Backport Compiler Utilities
Utility library for the Python bpc backport compiler.
Currently, the three individual tools (f2format, poseur, walrus) depend on this repo. The bpc compiler is a work in progress.
Module contents
Utility library for the Python bpc backport compiler.
-
exception bpc_utils.BPCInternalError(message, context)
Bases: RuntimeError
Internal bug happened in BPC tools.
Initialize BPCInternalError.
-
exception bpc_utils.BPCRecoveryError
Bases: RuntimeError
Error during file recovery.
-
exception bpc_utils.BPCSyntaxError
Bases: SyntaxError
Syntax error detected when parsing code.
-
class bpc_utils.BaseContext(node, config, *, indent_level=0, raw=False)
Bases: abc.ABC
Abstract base class for general conversion context.
Initialize BaseContext.
- Parameters
node (parso.tree.NodeOrLeaf) – parso AST
config (Config) – conversion configurations
indent_level (int) – current indentation level
raw (bool) – raw processing flag
-
final __iadd__(code)
Support of the += operator.
If self._prefix_or_suffix is True, then the code will be appended to self._prefix; else it will be appended to self._suffix.
- Parameters
code (str) – code string
- Returns
self
- Return type
-
final __str__()
Returns a stripped version of self._buffer.
- Return type
-
final _process(node)
Recursively process parso AST.
All processing methods for a specific node type are defined as _process_{type}. This method first checks whether such a processing method exists. If so, it calls that method on the node; otherwise it traverses all children of node and applies the same logic to each child.
- Parameters
node (parso.tree.NodeOrLeaf) – parso AST
- Return type
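The dispatch rule described for _process can be illustrated with a small standalone sketch. It does not use parso or bpc_utils: the Node class, the SketchContext class and the choice to copy unhandled leaves verbatim are all assumptions made for illustration only.

```python
class Node:
    """Hypothetical stand-in for a parso node: a type, some text, children."""

    def __init__(self, type, value='', children=()):
        self.type = type
        self.value = value
        self.children = list(children)


class SketchContext:
    """Toy context that dispatches on node.type, loosely mirroring _process."""

    def __init__(self):
        self.buffer = ''

    def process(self, node):
        # Look for a handler named after the node type.
        handler = getattr(self, '_process_%s' % node.type, None)
        if handler is not None:
            handler(node)
            return
        if node.children:
            # No handler: apply the same logic to every child.
            for child in node.children:
                self.process(child)
        else:
            # Leaf without a handler: copy its text unchanged (an assumption).
            self.buffer += node.value

    def _process_name(self, node):
        # Example handler: upper-case every 'name' leaf.
        self.buffer += node.value.upper()


tree = Node('expr', children=[
    Node('name', 'x'),
    Node('operator', ' + '),
    Node('name', 'y'),
])
ctx = SketchContext()
ctx.process(tree)
result = ctx.buffer  # 'X + Y'
```

Adding support for another node type in this sketch only requires defining one more _process_{type} method; the traversal itself stays unchanged.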
-
final _walk(node)
Start traversing the AST module.
The method traverses through all children of node. For each child, it first checks whether the child has the target expression. If so, it toggles self._prefix_or_suffix (sets it to False) and saves the previous child as self._node_before_expr. Then it processes the child with self._process.
- Parameters
node (parso.tree.NodeOrLeaf) – parso AST
- Return type
-
final static extract_whitespaces(code)
Extract preceding and succeeding whitespaces from the code given.
-
abstract has_expr(node)
Check if node has the target expression.
- Parameters
node (parso.tree.NodeOrLeaf) – parso AST
- Return type
- Returns
whether node has the target expression
-
final classmethod mangle(cls_name, var_name)
Mangle variable names.
This method mangles variable names as described in Python documentation about mangling and further normalizes the mangled variable name through normalize().
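For readers unfamiliar with the rule being referenced, the following is a simplified model of Python's private name mangling. It is a sketch, not the bpc_utils implementation: mangle_sketch is a hypothetical name, and the normalization step is left out.

```python
class Spam:
    def __init__(self):
        self.__eggs = 42  # the compiler stores this as _Spam__eggs


def mangle_sketch(cls_name, var_name):
    """Simplified model of Python's private name mangling."""
    # Only names with two leading underscores that do not end with two
    # underscores are mangled.
    if not var_name.startswith('__') or var_name.endswith('__'):
        return var_name
    stripped = cls_name.lstrip('_')
    if not stripped:
        # A class name made only of underscores leaves the name untouched.
        return var_name
    return '_%s%s' % (stripped, var_name)


mangled = mangle_sketch('Spam', '__eggs')
value = getattr(Spam(), mangled)  # reads the attribute under its mangled name
```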
-
final static missing_newlines(prefix, suffix, expected, linesep)
Count missing blank lines for code insertion given surrounding code.
-
final static normalize(name)
Normalize variable names.
This method normalizes variable names as described in Python documentation about identifiers and PEP 3131.
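PEP 3131 specifies that identifiers are converted to the NFKC normal form while parsing. A minimal sketch of that normalization with the standard library follows; normalize_sketch is a hypothetical name, and whether bpc_utils does anything beyond NFKC is not stated here.

```python
import unicodedata


def normalize_sketch(name):
    """Apply NFKC normalization, the form PEP 3131 prescribes for identifiers."""
    return unicodedata.normalize('NFKC', name)


# U+FB01 (LATIN SMALL LIGATURE FI) folds to the two letters 'fi'.
normalized = normalize_sketch('\ufb01le')
```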
-
final static split_comments(code, linesep)
Separates prefixing comments from code.
This method separates prefixing comments and suffixing code. It is rather useful when inserting code might break shebang and encoding cookies (PEP 263), etc.
-
_node_before_expr : Optional[parso.tree.NodeOrLeaf]
Preceding node with the target expression, i.e. the insertion point.
-
_prefix_or_suffix : bool
Flag to indicate whether buffer is now self._prefix.
-
_root : Final[parso.tree.NodeOrLeaf]
Root node given by the node parameter.
-
_uuid_gen : Final[UUID4Generator]
UUID generator.
-
property string
Returns conversion buffer (self._buffer).
- Return type
-
class bpc_utils.Config(**kwargs)
Bases: MutableMapping[str, object]
Configuration namespace.
This class is inspired by argparse.Namespace for storing internal attributes and/or configuration variables.
>>> config = Config(foo='var', bar=True)
>>> config.foo
'var'
>>> config['bar']
True
>>> config.bar = 'boo'
>>> del config['foo']
>>> config
Config(bar='boo')
-
class bpc_utils.Placeholder(name)
Bases: object
Placeholder for string interpolation.
Placeholder objects can be concatenated with str, other Placeholder objects and StringInterpolation objects via the ‘+’ operator.
Placeholder objects should be regarded as immutable. Please do not modify the name attribute. Build new objects instead.
Initialize Placeholder.
-
class bpc_utils.StringInterpolation(*args)
Bases: object
A string with placeholders to be filled in.
This looks like an object-oriented format string, but it makes sure that string literals are always interpreted literally (so there is no need to escape them manually). The boundaries between string literals and placeholders are very clear. Filling in a placeholder will never inject a new placeholder, protecting string integrity for multiple-round interpolation.
>>> s1 = '%(injected)s'
>>> s2 = 'hello'
>>> s = StringInterpolation('prefix ', Placeholder('q1'), ' infix ', Placeholder('q2'), ' suffix')
>>> str(s % {'q1': s1} % {'q2': s2})
'prefix %(injected)s infix hello suffix'
(This can be regarded as an improved version of string.Template.safe_substitute().)
Multiple-round interpolation is tricky to do with a traditional format string. In order to do things correctly and avoid format string injection vulnerabilities, you need to perform escapes very carefully.
>>> fs = 'prefix %(q1)s infix %(q2)s suffix'
>>> fs % {'q1': s1} % {'q2': s2}
Traceback (most recent call last):
    ...
KeyError: 'q2'
>>> fs = 'prefix %(q1)s infix %%(q2)s suffix'
>>> fs % {'q1': s1} % {'q2': s2}
Traceback (most recent call last):
    ...
KeyError: 'injected'
>>> fs % {'q1': s1.replace('%', '%%')} % {'q2': s2}
'prefix %(injected)s infix hello suffix'
StringInterpolation objects can be concatenated with str, Placeholder objects and other StringInterpolation objects via the ‘+’ operator.
StringInterpolation objects should be regarded as immutable. Please do not modify the literals and placeholders attributes. Build new objects instead.
Initialize StringInterpolation.
args will be concatenated to construct a StringInterpolation object.
>>> StringInterpolation('prefix', Placeholder('data'), 'suffix')
StringInterpolation('prefix', Placeholder('data'), 'suffix')
- Parameters
args (Union[str, Placeholder, StringInterpolation]) – the components to construct a StringInterpolation object
-
__mod__(substitutions)
Substitute the placeholders in this StringInterpolation object with string values (if possible) according to the substitutions mapping.
>>> StringInterpolation('prefix ', Placeholder('data'), ' suffix') % {'data': 'hello'}
StringInterpolation('prefix hello suffix')
- Parameters
substitutions (Mapping[str, object]) – a mapping from placeholder names to the values to be filled in; all values are converted into str
- Return type
- Returns
a new StringInterpolation object with as many placeholders substituted as possible
-
__str__()
Returns the fully-substituted string interpolation result.
>>> str(StringInterpolation('prefix hello suffix'))
'prefix hello suffix'
- Return type
- Returns
the fully-substituted string interpolation result
- Raises
ValueError – if there are still unsubstituted placeholders in this StringInterpolation object
-
static from_components(literals, placeholders)
Construct a StringInterpolation object from literals and placeholders components. This method is more efficient than the StringInterpolation() constructor, but it is mainly intended for internal use.
>>> StringInterpolation.from_components(
...     ('prefix', 'infix', 'suffix'),
...     (Placeholder('data1'), Placeholder('data2'))
... )
StringInterpolation('prefix', Placeholder('data1'), 'infix', Placeholder('data2'), 'suffix')
- Parameters
literals (Iterable[str]) – the literal components in order
placeholders (Iterable[Placeholder]) – the Placeholder components in order
- Return type
- Returns
the constructed StringInterpolation object
- Raises
TypeError – if literals is str; if literals contains non-str values; if placeholders contains non-Placeholder values
ValueError – if the length of literals is not exactly one more than the length of placeholders
-
iter_components()
Generator to iterate all components of this StringInterpolation object in order.
>>> list(StringInterpolation('prefix', Placeholder('data'), 'suffix').iter_components())
['prefix', Placeholder('data'), 'suffix']
- Return type
Generator[Union[str, Placeholder], None, None]
- Returns
generator containing the components of this StringInterpolation object in order
-
class bpc_utils.UUID4Generator(dash=True)
Bases: object
UUID 4 generator wrapper to prevent UUID collisions.
Constructor of UUID 4 generator wrapper.
- Parameters
dash (bool) – whether the generated UUID string has dashes or not
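One way such a wrapper could prevent collisions is to remember every value it has handed out and retry on a repeat. The sketch below assumes that approach; the class name UUID4GeneratorSketch and the method name gen are hypothetical and are not taken from the bpc_utils API.

```python
import uuid


class UUID4GeneratorSketch:
    """Hand out UUID 4 strings that are unique within this generator."""

    def __init__(self, dash=True):
        self.dash = dash
        self._seen = set()  # every string returned so far

    def gen(self):
        while True:
            value = uuid.uuid4()
            # str() gives the dashed form, .hex the 32-character form.
            text = str(value) if self.dash else value.hex
            if text not in self._seen:
                self._seen.add(text)
                return text


with_dash = UUID4GeneratorSketch().gen()
without_dash = UUID4GeneratorSketch(dash=False).gen()
```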
-
bpc_utils.TaskLock()
Function that returns a lock for possibly concurrent tasks.
- Return type
ContextManager[None]
- Returns
a lock for possibly concurrent tasks
-
bpc_utils.detect_encoding(code)
Detect encoding of Python source code as specified in PEP 263.
- Parameters
code (bytes) – the code to detect encoding
- Return type
- Returns
the detected encoding, or the default encoding (utf-8)
- Raises
SyntaxError – if both a BOM and a cookie are present, but disagree
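The standard library's tokenize.detect_encoding() also implements PEP 263 detection, so it can serve as a point of comparison. The sketch below wraps it to accept bytes; whether bpc_utils delegates to it is an assumption not confirmed here, and detect_encoding_sketch is a hypothetical name. Note that the standard library function reports normalized names and returns 'utf-8-sig' when a BOM is present.

```python
import io
import tokenize


def detect_encoding_sketch(code):
    """Detect the PEP 263 encoding of source bytes via the standard library."""
    # tokenize.detect_encoding() expects a readline callable over bytes.
    encoding, _lines = tokenize.detect_encoding(io.BytesIO(code).readline)
    return encoding


enc_cookie = detect_encoding_sketch(b'# -*- coding: latin-1 -*-\nx = 1\n')
enc_default = detect_encoding_sketch(b'x = 1\n')
```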
-
bpc_utils.detect_files(files)
Get a list of Python files to be processed according to user input.
This will perform glob expansion on Windows, make all paths absolute, resolve symbolic links and remove duplicates.
- Parameters
files (Iterable[str]) – a list of files and directories to process (usually provided by users on command-line)
- Return type
List[str]
- Returns
a list of Python files to be processed
See also
See expand_glob_iter() for more information.
-
bpc_utils.detect_indentation(code)
Detect indentation of Python source code.
- Parameters
code (Union[str, bytes, TextIO, parso.tree.NodeOrLeaf]) – the code to detect indentation
- Return type
- Returns
the detected indentation sequence
- Raises
TokenError – when failed to tokenize the source code under certain cases, see documentation of TokenError for more details
Notes
With mixed indentation, the result is decided by voting on the number of occurrences of each indentation value (spaces and tabs).
When there is a tie between spaces and tabs, 4 spaces are preferred, following PEP 8.
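The voting rule in the notes can be sketched with plain line scanning. This is a simplification under stated assumptions: the documented function tokenizes the source, whereas this sketch only looks at leading whitespace, and using the smallest space indent as the unit is an assumption. detect_indentation_sketch is a hypothetical name.

```python
from collections import Counter


def detect_indentation_sketch(code):
    """Vote between tab-indented and space-indented lines."""
    votes = Counter()
    widths = Counter()
    for line in code.splitlines():
        stripped = line.lstrip(' \t')
        if not stripped or stripped == line:
            continue  # skip blank and unindented lines
        indent = line[:len(line) - len(stripped)]
        if indent[0] == '\t':
            votes['tab'] += 1
        else:
            votes['space'] += 1
            widths[len(indent)] += 1
    if votes['tab'] > votes['space']:
        return '\t'
    if votes['space'] > votes['tab']:
        # Assumption: the smallest space indent seen is the indentation unit.
        return ' ' * min(widths)
    return ' ' * 4  # tie, or nothing indented: 4 spaces as in PEP 8
```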
-
bpc_utils.detect_linesep(code)
Detect linesep of Python source code.
- Parameters
code (Union[str, bytes, TextIO, parso.tree.NodeOrLeaf]) – the code to detect linesep
- Returns
the detected linesep (one of '\n', '\r\n' and '\r')
- Return type
Notes
With mixed line separators, the result is decided by voting on the number of occurrences of each linesep value.
When there is a tie, LF is preferred over CRLF, and CRLF over CR.
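The voting and tie-breaking rule above can be sketched for the str case as follows. detect_linesep_sketch is a hypothetical name, and the other accepted input types (bytes, file objects, parso nodes) are not handled.

```python
import re


def detect_linesep_sketch(code):
    """Return the most frequent line separator, preferring LF, then CRLF, then CR."""
    counts = {'\n': 0, '\r\n': 0, '\r': 0}
    # Match CRLF first so it is not counted as a separate CR and LF.
    for match in re.findall(r'\r\n|\r|\n', code):
        counts[match] += 1
    # max() returns the first maximal key, so the insertion order
    # LF, CRLF, CR doubles as the tie-breaking preference.
    return max(counts, key=counts.get)
```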
-
bpc_utils.first_non_none(*args)
Return the first non-None value from a list of values.
- Parameters
*args – variable length argument list
If one positional argument is provided, it should be an iterable of the values.
If two or more positional arguments are provided, then the value list is the positional argument list.
- Returns
the first non-None value; None if all values are None or the sequence is empty
- Raises
TypeError – if no arguments provided
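The calling convention described above (one iterable, or two or more values) can be sketched like this. first_non_none_sketch is a hypothetical name and the code is an illustration of the documented behaviour, not the library source.

```python
def first_non_none_sketch(*args):
    """Return the first value that is not None, or None if there is none."""
    if not args:
        raise TypeError('no arguments provided')
    # A single argument is treated as an iterable of candidate values.
    values = args[0] if len(args) == 1 else args
    return next((value for value in values if value is not None), None)
```

Unlike a chain of `or` operators, this keeps falsy values such as 0 or an empty string, which matters when they are legitimate configuration values.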
-
bpc_utils.first_truthy(*args)
Return the first truthy value from a list of values.
- Parameters
*args – variable length argument list
If one positional argument is provided, it should be an iterable of the values.
If two or more positional arguments are provided, then the value list is the positional argument list.
- Returns
the first truthy value; None if no truthy value is found or the sequence is empty
- Raises
TypeError – if no arguments provided
-
bpc_utils.get_parso_grammar_versions(minimum=None)
Get Python versions that parso supports to parse grammar.
- Parameters
minimum (Optional[str]) – filter result by this minimum version
- Return type
List[str]
- Returns
a list of Python versions that parso supports to parse grammar
- Raises
ValueError – if minimum is invalid
-
bpc_utils.map_tasks(func, iterable, posargs=None, kwargs=None, *, processes=None, chunksize=None)
Execute tasks in parallel if multiprocessing is available, otherwise execute them sequentially.
- Parameters
func (Callable[..., T]) – the task function to execute
iterable (Iterable[object]) – the items to process
posargs (Optional[Iterable[object]]) – additional positional arguments to pass to func
kwargs (Optional[Mapping[str, object]]) – keyword arguments to pass to func
processes (Optional[int]) – the number of worker processes (default: auto determine)
chunksize (Optional[int]) – chunk size for multiprocessing
- Return type
List[T]
- Returns
the return values of the task function applied on the input items and additional arguments
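A rough sketch of the parallel-or-sequential fallback follows. Several details are assumptions rather than documented behaviour: that func is called as func(item, *posargs, **kwargs), that processes=1 selects the sequential path, and the names map_tasks_sketch and _call.

```python
import functools

try:
    import multiprocessing as mp
except ImportError:  # some platforms ship without multiprocessing
    mp = None


def _call(func, posargs, kwargs, item):
    # Assumed calling convention: the item comes first, extras follow.
    return func(item, *posargs, **kwargs)


def map_tasks_sketch(func, iterable, posargs=None, kwargs=None, *,
                     processes=None, chunksize=None):
    """Apply func to every item, in a process pool when one is usable."""
    bound = functools.partial(_call, func, tuple(posargs or ()), dict(kwargs or {}))
    if mp is None or processes == 1:
        return [bound(item) for item in iterable]
    with mp.Pool(processes) as pool:
        return pool.map(bound, iterable, chunksize)


def scale(x, factor, offset=0):
    return x * factor + offset


results = map_tasks_sketch(scale, [1, 2, 3], posargs=[10],
                           kwargs={'offset': 1}, processes=1)
```

When the pool path is used, func and its arguments must be picklable, and on platforms that start workers with spawn the calling code needs the usual `if __name__ == '__main__':` guard.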
-
bpc_utils.parse_boolean_state(s)
Parse a boolean state from a string representation.
These values are regarded as True: '1', 'yes', 'y', 'true', 'on'
These values are regarded as False: '0', 'no', 'n', 'false', 'off'
Value matching is case insensitive.
- Parameters
s (Optional[str]) – string representation of a boolean state
- Return type
Optional[bool]
- Returns
- Raises
ValueError – if s is an invalid boolean state value
See also
See _boolean_state_lookup for default lookup mapping values.
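The lookup described above can be sketched with a plain dictionary. The mapping below copies the documented values; returning None for a None input is an assumption based on the Optional types in the signature, and the names _LOOKUP_SKETCH and parse_boolean_state_sketch are hypothetical.

```python
_LOOKUP_SKETCH = {
    '1': True, 'yes': True, 'y': True, 'true': True, 'on': True,
    '0': False, 'no': False, 'n': False, 'false': False, 'off': False,
}


def parse_boolean_state_sketch(s):
    """Map a string such as 'yes' or 'OFF' to a bool."""
    if s is None:
        return None  # assumption: a missing value stays missing
    try:
        # Lower-casing makes the match case insensitive.
        return _LOOKUP_SKETCH[s.lower()]
    except KeyError:
        raise ValueError('invalid boolean state value: %r' % s) from None
```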
-
bpc_utils.parse_indentation(s)
Parse indentation from a string representation.
If an integer or a string of a positive integer n is specified, then the indentation is n spaces.
If 't' or 'tab' is specified, then the indentation is a tab.
If '\t' (the tab character itself) or a string consisting only of the space character (U+0020) is specified, it is returned directly.
Value matching is case insensitive.
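The three documented cases can be sketched as below. The handling of anything outside those cases (here a ValueError) is an assumption, since the error behaviour is not described above; parse_indentation_sketch is a hypothetical name.

```python
def parse_indentation_sketch(s):
    """Turn '4', 4, 'tab', a tab character or a run of spaces into an indentation string."""
    if isinstance(s, int):
        if s > 0:
            return ' ' * s
        raise ValueError('invalid indentation value: %r' % s)
    if s.lower() in ('t', 'tab'):
        return '\t'
    if s == '\t' or (s and set(s) == {' '}):
        return s  # already a literal indentation sequence
    if s.isdecimal() and int(s) > 0:
        return ' ' * int(s)
    # Assumption: every other input is rejected.
    raise ValueError('invalid indentation value: %r' % s)
```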
-
bpc_utils.parse_linesep(s)
Parse linesep from a string representation.
These values are regarded as '\n': '\n', 'lf'
These values are regarded as '\r\n': '\r\n', 'crlf'
These values are regarded as '\r': '\r', 'cr'
Value matching is case insensitive.
- Parameters
s (Optional[str]) – string representation of linesep
- Returns
the parsed linesep result; None if the input is None or an empty string
- Return type
Optional[Linesep]
- Raises
ValueError – if s is an invalid linesep value
See also
See _linesep_lookup for default lookup mapping values.
-
bpc_utils.parse_positive_integer(s)
Parse a positive integer from a string representation.
-
bpc_utils.parso_parse(code, filename=None, *, version=None)
Parse Python source code with parso.
- Parameters
- Return type
- Returns
parso AST
- Raises
BPCSyntaxError – when source code contains syntax errors
bpc_utils.
recover_files
(archive_file_or_dir, *, rr=False, rs=False)[source]¶ Recover files from a tar archive, optionally removing the archive file and archive directory after recovery.
This function supports three modes:
- Normal mode (when
rr
andrs
are bothFalse
): Recover from the archive file specified by
archive_file_or_dir
.
- Normal mode (when
- Recover and remove (when
rr
isTrue
): Recover from the archive file specified by
archive_file_or_dir
, and remove this archive file after recovery.
- Recover and remove (when
- Recover from the only file in the archive directory (when
rs
isTrue
): If the directory specified by
archive_file_or_dir
contains exactly one (regular) file, recover from that file and remove the archive directory.
- Recover from the only file in the archive directory (when
Specifying both
rr
andrs
asTrue
is not accepted.- Parameters
- Raises
ValueError – when
rr
andrs
are bothTrue
BPCRecoveryError – when
rs
isTrue
, and the directory specified byarchive_file_or_dir
is empty, contains more than one item, or contains a non-regular file
- Return type
-
bpc_utils.
Linesep
¶ Type alias for
Literal['\n', '\r\n', '\r']
.
Internal utilities¶
-
bpc_utils.argparse._boolean_state_lookup
A mapping from string representation to boolean states. The values are used for parse_boolean_state().
-
bpc_utils.argparse._linesep_lookup
- Type
Final[Dict[str, Linesep]]
A mapping from string representation to linesep. The values are used for parse_linesep().
-
bpc_utils.fileprocessing.LOOKUP_TABLE : Final[str] = '_lookup_table.json'
File name for the lookup table in the archive file.
- Type
Final[str]
-
bpc_utils.fileprocessing.is_python_filename(filename)
Determine whether a file is a Python source file by its extension.
-
bpc_utils.fileprocessing.expand_glob_iter(pattern)
Wrapper function to perform glob expansion.
-
class bpc_utils.logging.BPCLogHandler
Bases: logging.StreamHandler
Handler used to format BPC logging records.
Initialize BPCLogHandler.
-
format(record)
Format the specified record based on log level.
The record will be formatted based on its log level in the following flavour:
DEBUG – [%(levelname)s] %(asctime)s %(message)s
INFO – %(message)s
WARNING – Warning: %(message)s
ERROR – Error: %(message)s
CRITICAL – Error: %(message)s
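Per-level formatting of this kind can be sketched by overriding format() on a logging.StreamHandler subclass. The templates below are copied from the list above; the class name LevelTemplateHandler, the logger name and the fallback template for unknown levels are hypothetical, and the sketch ignores the time_format attribute.

```python
import io
import logging

TEMPLATES = {
    'DEBUG': '[%(levelname)s] %(asctime)s %(message)s',
    'INFO': '%(message)s',
    'WARNING': 'Warning: %(message)s',
    'ERROR': 'Error: %(message)s',
    'CRITICAL': 'Error: %(message)s',
}


class LevelTemplateHandler(logging.StreamHandler):
    """Stream handler that picks a format string by record level."""

    def format(self, record):
        # Unknown levels fall back to the bare message (an assumption).
        template = TEMPLATES.get(record.levelname, '%(message)s')
        return logging.Formatter(template).format(record)


stream = io.StringIO()
logger = logging.getLogger('bpc_sketch')
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the output confined to our handler
logger.addHandler(LevelTemplateHandler(stream))

logger.info('converted 3 files')
logger.warning('skipped a.py')
logger.error('cannot read b.py')
output = stream.getvalue()
```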
-
format_templates : Dict[str, str] = {'CRITICAL': 'Error: %(message)s', 'DEBUG': '[%(levelname)s] %(asctime)s %(message)s', 'ERROR': 'Error: %(message)s', 'INFO': '%(message)s', 'WARNING': 'Warning: %(message)s'}
-
time_format = '%Y-%m-%d %H:%M:%S.%f%z'
-
bpc_utils.misc.current_time_with_tzinfo()
Get the current time with local time zone information.
- Return type
- Returns
datetime object representing current time with local time zone information
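With the standard library, a timezone-aware local time can be obtained by calling astimezone() on the current time. The sketch below shows that approach; it is not necessarily how bpc_utils implements the function, and current_time_with_tzinfo_sketch is a hypothetical name.

```python
import datetime


def current_time_with_tzinfo_sketch():
    """Return the current local time as a timezone-aware datetime."""
    # astimezone() with no argument attaches the system local time zone.
    return datetime.datetime.now().astimezone()


now = current_time_with_tzinfo_sketch()
stamp = now.strftime('%Y-%m-%d %H:%M:%S.%f%z')
```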
-
class bpc_utils.misc.MakeTextIO(obj)
Bases: object
Context wrapper class to handle str and file objects together.
- Variables
Initialize context.
- Parameters
obj (Union[str, TextIO]) – the object to manage in the context
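The idea of treating a str and an open text file uniformly can be sketched with a small context manager. This is an illustration under assumptions: wrapping strings in io.StringIO and leaving caller-supplied file objects open on exit are guesses at reasonable behaviour, and MakeTextIOSketch and first_line are hypothetical names.

```python
import io


class MakeTextIOSketch:
    """Yield a text stream for either a str or an existing file object."""

    def __init__(self, obj):
        self.obj = obj

    def __enter__(self):
        if isinstance(self.obj, str):
            # Wrap plain strings so callers can use the file API.
            self._stream = io.StringIO(self.obj)
            self._owned = True
        else:
            self._stream = self.obj
            self._owned = False
        return self._stream

    def __exit__(self, exc_type, exc_value, traceback):
        # Only close streams created here; the caller owns its own files.
        if self._owned:
            self._stream.close()


def first_line(source):
    with MakeTextIOSketch(source) as stream:
        return stream.readline()
```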
-
bpc_utils.multiprocessing.mp
- Type
Optional[ModuleType]
- Value
<module ‘multiprocessing’>
An alias of the Python builtin multiprocessing module if available.
-
bpc_utils.multiprocessing._mp_map_wrapper(args)
Map wrapper function for multiprocessing.
-
bpc_utils.multiprocessing._mp_init_lock(lock)
Initialize lock for multiprocessing.