Suppose you are working on a project in Python that takes advantage of Python's support for higher-order functions. Perhaps you are allowing users to specify their own hooks or event handlers within a web application framework, or you are creating a unit testing library that generates test cases for a user-supplied function. In many such use cases, you might find yourself dealing with two related scenarios:
The first of these cases is more straightforward to handle at runtime: inspect the type of the output and branch accordingly. However, in the event that you need to determine this information in advance, you may not always be free to invoke the function (e.g., if the function relies on some resource that is not yet available or running the function has a very high cost). The second is more evidently difficult, and you would need to provide for the user a way to specify the input type. How can you organize your API to handle both of these issues in a way that keeps your API clean and allows users of your API to leverage features already built into Python?
While there exist within the Python community standards for documenting information about the inputs and outputs of functions and methods within docstrings (such as those in the Google Python Style Guide), it is explicitly recommended that the information inside docstrings should not consist of a signature. This is in contrast with documentation conventions maintained by communities for other programming languages (such as JSDoc).
So how should programmers document the exact type signature of a function if they wish to do so? Python 3 has supported annotations on function parameters and return values since its first release, and Python 3.5 standardized their use for types by introducing the typing library; Python 3.6 then extended the syntax to variable annotations. Together, these features allow for type annotations: the ability to specify the types of variables and functions at the time they are defined. The documentation calls these type hints because the feature is purely syntactic. The interpreter does not check type annotations statically (i.e., at the time the code is parsed and transformed into bytecode) or dynamically (i.e., when the code is actually running).
The cosmetic quality of this built-in feature does not limit its utility, however. In addition to providing a more standard and formally endorsed representation for information that might otherwise be relegated to documentation strings or formatted comments, the Python community is free to write its own static and dynamic analysis tools that use the native syntax for type annotations.
More information on the built-in type specification library and the type annotation concrete syntax can be found in its documentation page. The corresponding new additions to the Python abstract syntax (a topic covered in more detail in another article) are actually quite few in number. A comparison of the Python 2.7 grammar and the Python 3.6 grammar shows that new parameters appear in only a few places:
the annotation parameter in the arg parameter of the FunctionDef and AsyncFunctionDef cases,
the returns parameter in the FunctionDef and AsyncFunctionDef cases, and
the annotation parameter of the AnnAssign case.
All that is really happening here is that the Python syntax now has a few extra contexts, delimited using the tokens : and ->, in which programmers can add expressions that will make it into the abstract syntax tree as type annotations.
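As a quick illustration of the fields listed above, the following sketch parses two small snippets with the built-in ast module and reads the annotation fields directly. The snippet strings are made up for this illustration.

```python
import ast

# Parse a function definition and look at its annotation fields.
fn = ast.parse("def f(x: int) -> int:\n    return x + x\n").body[0]
print(type(fn).__name__)              # FunctionDef
print(fn.args.args[0].annotation.id)  # int
print(fn.returns.id)                  # int

# Parse an annotated assignment statement.
assign = ast.parse("n: int = 123").body[0]
print(type(assign).__name__)          # AnnAssign
print(assign.annotation.id)           # int
```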
By convention, type annotations that refer to built-in types simply use the type constructors (e.g., int, float, and str). The example below demonstrates how type annotations can be included in assignment statements.
n: int = 123
s: str = "abc"
The example below demonstrates how type annotations can be included in function definitions.
def f(x: int) -> int:
    return x + x
Note that no static or dynamic type checking takes place; the annotations are ignored by the interpreter.
(f(123), f("abc"))
The built-in typing library provides a number of useful functions for building up more complex and also user-defined types. The example below illustrates how the Tuple constructor can be used to specify a tuple type. Note the use of overloading to repurpose the bracket notation that is usually used for indexing.
from typing import Tuple
def repeat(si: Tuple[str, int]) -> str:
    (s, i) = si
    return s * i
It is also possible to specify user-defined types.
from typing import NewType
UserName = NewType("UserName", str)
def confirm(s: UserName) -> bool:
    return s == "Alice"
As before, note that no type checking occurs.
(confirm("Alice"), confirm("Bob"), confirm(123))
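A small sketch of what NewType does at run time may help explain why nothing is checked: calling the new type simply returns its argument unchanged. The definitions of UserName and confirm are repeated from the example above so that the block is self-contained.

```python
from typing import NewType

UserName = NewType("UserName", str)

def confirm(s: UserName) -> bool:
    return s == "Alice"

# Calling UserName returns its argument as-is; the result is an ordinary str.
alice = UserName("Alice")
print(type(alice) is str)  # True

# The annotation on confirm is not enforced, so any argument is accepted.
print((confirm("Alice"), confirm("Bob"), confirm(123)))  # (True, False, False)
```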
It is also possible to introduce type variables. This is particularly useful for specifying types for functions that are examples of parametric polymorphism. In the example below, the function is an example of parametric polymorphism in that it can operate on any list, regardless of the types of the items in that list.
from typing import Sequence, TypeVar
T = TypeVar("T")
def first(xs: Sequence[T]) -> T:
    return xs[0]
This annotation would be an indication from whoever implemented it that the function first can be applied to a list of any type as long as all the elements in that list are of the same type.
(first([1,2,3]), first(["a", "b", "c"]))
The annotation in the above example is different from the annotation below, which indicates that the types of the items in the input list can be mixed (e.g., [123, "abc"]) as long as they are each either an integer or a string.
from typing import Sequence, Union
def first(xs: Sequence[Union[int, str]]) -> Union[int, str]:
    return xs[0]
To return to the motivating example introduced in the first paragraph of this article, suppose you are creating a unit testing framework that generates random inputs for functions in order to check that (1) they always return an output of the specified type for every input and (2) they do not raise any exceptions. You can allow users of your library to specify the input types of the functions they are trying to test via Python type annotations. This ensures you are not reinventing the wheel and that your users are not cluttering their code more than necessary with decorators or other additional information that is useful only for your framework and nothing else.
There are two distinct ways to extract the type annotations associated with a function. One approach (using concepts and techniques covered in detail in another article) is to inspect the source code of the function, parse it into an abstract syntax tree, and then extract the annotations from that abstract syntax tree.
import inspect
import ast
def signature(f):
    # Parse the function and extract types from the AST.
    a = ast.parse(inspect.getsource(f))
    type_in = a.body[0].args.args[0].annotation.id
    type_out = a.body[0].returns.id
    return (type_in, type_out)
One benefit of this approach is that you are extracting the original text found in the definition as a string (rather than the value or object to which it evaluates).
Number = int
def double(x: Number) -> Number:
    return x + x
signature(double)
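The signature function above reads the id attribute, so it only handles annotations that are plain names such as int or Number. For compound annotations such as Tuple[str, int], one possible extension is to turn each annotation node back into source text with ast.unparse, which is available in Python 3.9 and later. The sketch below works on a source string rather than a function object to stay self-contained; SOURCE and signature_text are hypothetical names introduced only for this illustration.

```python
import ast

# Hypothetical source text; it is only parsed here, never executed.
SOURCE = """
def repeat(si: Tuple[str, int]) -> str:
    (s, i) = si
    return s * i
"""

def signature_text(source):
    # Assumes the first statement in the source is a function definition.
    fn = ast.parse(source).body[0]
    # Convert each annotation node back into the text that produced it.
    types_in = [
        ast.unparse(arg.annotation)
        for arg in fn.args.args
        if arg.annotation is not None
    ]
    type_out = ast.unparse(fn.returns) if fn.returns is not None else None
    return (types_in, type_out)

print(signature_text(SOURCE))
```

Because the annotations are never evaluated, this works even when names such as Tuple have not been imported.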
Another approach is to use the __annotations__ attribute of a function.
double.__annotations__
Note that because return is a reserved word in the Python concrete syntax, it is safe for it to appear as a key in a dictionary in which all other keys are names of input parameters. The variant of signature below assumes that there is only one input parameter in the function definition.
def signature(f):
    a = f.__annotations__
    type_in = [a[k] for k in a if k != 'return'][0]
    type_out = a['return']
    return (type_in, type_out)
When using this approach, you receive the evaluated result of the expression that appeared within the annotation context. If the original synonym used for a type (as in the first example above) is important to obtain for your application or scenario, that information may be lost after evaluation. If all you care about is the actual type and not the name of the user-defined synonym, this approach is a more direct way to obtain the annotation information.
signature(double)
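The variant above assumes a single input parameter. One possible generalization, sketched below, combines typing.get_type_hints (which returns the evaluated annotations as a dictionary) with inspect.signature (which lists the parameters in order). The names signature_all and scale are hypothetical, and reporting an unannotated parameter as None is a choice made for this sketch.

```python
import inspect
from typing import get_type_hints

def signature_all(f):
    # Evaluated annotations, keyed by parameter name plus 'return'.
    hints = get_type_hints(f)
    type_out = hints.pop("return", None)
    # Walk the parameters in definition order; unannotated ones become None.
    types_in = [hints.get(name) for name in inspect.signature(f).parameters]
    return (types_in, type_out)

def scale(x: float, n: int) -> float:
    return x * n

print(signature_all(scale))
```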
One important point to consider once you have chosen one of the two techniques above is how you might check whether an object or value is of the type you obtained from the annotation. If the type annotation information is in the form of a string and you are checking a value of a built-in type or an object of a user-defined class, you can perform the check by extracting the name of the type and doing a string comparison.
type(123).__name__ == "int"
On the other hand, if the type annotation information is in the form of a value or object that represents a type (such as int), you can check that a value is of the type in the following way.
isinstance(123, int)
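The isinstance check works for plain classes, but it raises a TypeError for parameterized annotations such as Tuple[str, int]. A minimal sketch of one way to handle a few such cases is shown below, using typing.get_origin and typing.get_args (available in Python 3.8 and later). The function name conforms is hypothetical, and the sketch deliberately covers only plain classes, Union, and fixed-length Tuple annotations.

```python
from typing import Tuple, Union, get_args, get_origin

def conforms(value, tp):
    origin = get_origin(tp)
    if origin is None:
        # A plain class such as int or str.
        return isinstance(value, tp)
    if origin is Union:
        # The value must match at least one alternative.
        return any(conforms(value, arg) for arg in get_args(tp))
    if origin is tuple:
        # Fixed-length tuples only: compare position by position.
        args = get_args(tp)
        return (
            isinstance(value, tuple)
            and len(value) == len(args)
            and all(conforms(v, a) for (v, a) in zip(value, args))
        )
    raise NotImplementedError("annotation not handled by this sketch")

print(conforms(("abc", 3), Tuple[str, int]))  # True
print(conforms(1.5, Union[int, str]))         # False
```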
A concrete example of a component in your unit testing framework is presented below. It can ingest a function that takes a single input and produces a single output. This component will generate random inputs of the appropriate type, depending on whether the input type annotation of the supplied function indicates that the input must be an integer or a floating point number. It will then check that the output type matches the function's output type annotation and that no exceptions are raised.
import random
def safe(f):
    (type_in, type_out) = signature(f)
    for i in range(10000): # Run 10,000 trials.
        # Generate a random input of the appropriate type.
        if type_in is int:
            value_in = random.randint(-2**16, (2**16)-1)
        if type_in is float:
            value_in = random.uniform(-2**16, (2**16)-1)
        # Check that output has the correct type.
        try:
            value_out = f(value_in)
            assert(isinstance(value_out, type_out))
        except:
            return False
    return True # All trials succeeded.
The safe function is applied to some example inputs below. The triple function correctly returns an integer in all cases. However, the floor function incorrectly (at least, according to its type specification) returns floating point numbers when its input is not positive.
def triple(x: int) -> int:
    return x + x + x
def floor(x: float) -> int:
    return int(x) if x > 0 else x
(safe(triple), safe(floor))
Notice that this approach allows you to create a testing component that can handle failures during the testing process without raising an exception, making it possible to build long chains of tests for many functions without worrying about unexpected termination of the Python interpreter.
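When chaining many tests, it can also help to know which input caused a failure. The sketch below is one possible variant of the component above that returns the first failing input instead of a boolean. The names GENERATORS and counterexample are hypothetical, the trial count is arbitrary, and the functions triple and floor are repeated so that the block is self-contained.

```python
import random

# Hypothetical table mapping a type to a function that generates random values.
GENERATORS = {
    int: lambda: random.randint(-2**16, (2**16) - 1),
    float: lambda: random.uniform(-2**16, (2**16) - 1),
}

def counterexample(f, trials=1000):
    # Assumes exactly one annotated parameter and an annotated return type.
    hints = dict(f.__annotations__)
    type_out = hints.pop("return")
    (type_in,) = hints.values()
    generate = GENERATORS[type_in]
    for _ in range(trials):
        value_in = generate()
        try:
            if not isinstance(f(value_in), type_out):
                return value_in  # Output has the wrong type.
        except Exception:
            return value_in  # The function raised an exception.
    return None  # No failing input was found.

def triple(x: int) -> int:
    return x + x + x

def floor(x: float) -> int:
    return int(x) if x > 0 else x

print(counterexample(triple))  # None
print(counterexample(floor))   # Some non-positive float (varies per run).
```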
Reviewing the Python Enhancement Proposal associated with the type annotation feature provides important context about how the Python community views this feature and its future evolution. In particular, it is stated explicitly that it should not be inferred from the presence of this feature that Python will ever become a language that provides native support for type checking or requires type annotations. However, the community does envision the creation of user-built libraries for these purposes and views annotations as a standardized foundation.
Some third-party tools leverage support for type annotations to provide static type checking and even type inference. The mypy tool can type-check a program statically (without running it) and can infer the types of many local variables that are not annotated explicitly. Other community members have written libraries that allow programmers to perform run-time type checking simply by adding a decorator. One example is enforce, though the project appears to be dormant and is no longer compatible with the typing library in the latest releases of Python.