Contributing Guidelines

This document describes the existing developer tooling we have in place (and what to expect of it), as well as our design and development philosophy.

Naming Conventions

Naming conventions are not automatically enforced, so please read the naming conventions section of PEP 8, which describes what each of the different styles means. A short summary of the most important parts:

Modules (and hence files) should have short, all-lowercase names.
Class (and exception) names should normally use the CapWords convention (also known as CamelCase).
Function and variable names should be lowercase, with words separated by underscores as necessary to improve readability (also known as snake_case).
To avoid collisions with the standard library, an underscore can be appended, such as id_.
Always use self for the first argument to instance methods.
Always use cls for the first argument to class methods.
One leading underscore, like _data, marks non-public methods and instance variables that subclasses may still use. If a name should not be used by subclasses, use two leading underscores, like __data.
If there is a pair of get_x and set_x methods, they should instead be a proper property, which is easy to do with the built-in @property decorator.
Constants should be CAPITALIZED_SNAKE_CASE.
When importing a function, try to avoid renaming it with import as because it introduces cognitive overhead to track yet another name.

When in doubt, adhere to existing conventions, or check the style guide.
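The conventions above can be sketched in one small module. This is only an illustration: the module, class, and attribute names (sensors.py, TemperatureSensor, SensorError, MAX_READINGS) are hypothetical and do not come from LISA.

```python
"""A hypothetical module, saved as e.g. sensors.py (short, all-lowercase)."""
from typing import List

MAX_READINGS = 3  # constant: CAPITALIZED_SNAKE_CASE


class SensorError(Exception):
    """Exception names use CapWords, like any other class."""


class TemperatureSensor:
    def __init__(self, id_: int) -> None:
        # Trailing underscore avoids shadowing the built-in id().
        self.id_ = id_
        # One leading underscore: non-public, but usable by subclasses.
        self._readings: List[float] = []
        # Two leading underscores: name-mangled, private to this class.
        self.__offset = 0.0

    @property
    def offset(self) -> float:
        """A property replaces a get_offset/set_offset pair."""
        return self.__offset

    @offset.setter
    def offset(self, value: float) -> None:
        self.__offset = value

    @classmethod
    def with_default_id(cls) -> "TemperatureSensor":
        # Class methods take cls as their first argument.
        return cls(id_=0)

    def add_reading(self, raw_value: float) -> None:
        # Functions and variables use snake_case.
        if len(self._readings) >= MAX_READINGS:
            raise SensorError("too many readings")
        self._readings.append(raw_value + self.__offset)
```

Callers then write sensor.offset = 1.5 rather than sensor.set_offset(1.5), which keeps the public interface small if the attribute later needs validation.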

Automated Tooling

If you have already run LISAv3, then you have installed and used the poetry tool. Poetry is a PEP 518 compliant, cross-platform build system which handles our Python dependencies and environment.

This project’s dependencies are found in the pyproject.toml file, which is similar to but more powerful than the familiar requirements.txt. For background on this file, see PEP 518 and PEP 621.

Metadata

The first section, tool.poetry, defines the project’s metadata (name, version, description, authors, and license) which will be embedded in the final built package.

The chosen version follows Semantic Versioning, with the Python specific pre-release versioning suffix ‘.dev1’. Since this is “LISAv3” it seemed appropriate to set our version to ‘3.0.0.dev1’, that is, “the first development release of LISAv3.”

Package Dependencies

The next section, tool.poetry.dependencies, is where poetry add <package_name> records our required packages.

Poetry automatically creates and manages isolated environments.

From the documentation:

Poetry will first check if it’s currently running inside a virtual environment. If it is, it will use it directly without creating a new one. But if it’s not, it will use one that it has already created or create a brand new one for you.

On Linux, your initial run of poetry install will cause Poetry to automatically set up a new virtualenv using pyenv. If you are developing on Windows, you will want to set up your own, perhaps using Conda.

python: We pinned Python to version 3.8 so everyone uses the same version.
psutil: TODO @squirrelsc will document
pyyaml: TODO @squirrelsc will document
retry: TODO @squirrelsc will document
paramiko: TODO @squirrelsc will document
spurplus: TODO @squirrelsc will document
dataclasses-json: TODO @squirrelsc will document (brings in ujson which requires gcc and libpython)
portalocker: TODO @squirrelsc will document
azure-*: TODO @squirrelsc will document

Developer Dependencies

Similar to the previous section, tool.poetry.dev-dependencies is where poetry add --dev <package_name> records our developer packages. These are not necessary for LISAv3 to execute, but are used by developers to automatically adhere to our coding standards.

Black, the opinionated code formatter which settles all debates as to how our Python files should be formatted. It follows PEP 8, the official Python style guide, and where ambiguous makes the decision for us.
Flake8 (and integrations), the semantic analyzer, used to coordinate most of the other tools.
isort, the import sorter, which automatically splits imports into the expected, alphabetized sections.
mypy, the static type checker, which coupled with type annotations allows us to avoid the pitfalls of Python being a dynamically typed language.
python-language-server (and integrations), the de facto LSP server. While Microsoft is developing their own LSP servers, they do not integrate with the existing ecosystem of tools, and their latest tool, Pyright, simply does not support pyproject.toml. Since pyls is used far more widely, and supports every editor, we use it.
rope, to provide completions and renaming support to pyls.

With these packages installed and a correctly set up editor (see the readme and feel free to reach out to us), your code should automatically follow all the standards which we could automate.

The final sections, tool.black, tool.isort, build-system, and the .flake8 file (Flake8 does not yet support pyproject.toml) configure the tools per their recommendations.

Type Annotations

We are using mypy to enforce static type checking of our Python code. This may surprise you as Python is not a statically typed language. While dynamic typing can be useful, for a complex tool such as LISA it is more likely to introduce bugs that are found only at runtime (which the user experiences as a crash). For more information on why we (and others) do this, see Dropbox’s journey to type checking 4 million lines of Python. PEP 484 and PEP 526 (among others) introduced and defined type hints for the Python language. You can probably figure out the syntax based on the surrounding code, but you can also see this Intro to Using Python Type Hints and mypy’s cheat sheet.
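As a minimal sketch of what annotated code looks like, the following uses PEP 484 function annotations and a PEP 526 variable annotation. The function and variable names (find_node, count_capabilities, known_nodes) are hypothetical and not part of LISA.

```python
from typing import Dict, List, Optional


def find_node(nodes: Dict[str, List[str]], name: str) -> Optional[List[str]]:
    """Return the capabilities for a node, or None if it is unknown."""
    return nodes.get(name)


def count_capabilities(nodes: Dict[str, List[str]], name: str) -> int:
    capabilities = find_node(nodes, name)
    # Without this None check, mypy reports an error on len(capabilities),
    # because the value is Optional.
    if capabilities is None:
        return 0
    return len(capabilities)


# PEP 526 variable annotation.
known_nodes: Dict[str, List[str]] = {"node-0": ["ssh", "serial"]}
```

The point of the Optional return type is that the "unknown node" case becomes something the checker forces callers to handle, rather than a crash found at runtime.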

Runbook schema

Some plugins, such as Platform, need to follow this section to extend the runbook schema. A runbook is the configuration of a LISA run, and every LISA run needs one.

The runbook schema is defined with dataclass, deserialized with dataclasses-json, and validated with marshmallow.

See more examples in schema.py, if you need to extend runbook schema.
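To show the general shape of a dataclass-based schema section, here is a simplified sketch that uses only the standard library. It is not how schema.py works: the class ExamplePlatformSchema and the helper load_section are hypothetical, and the hand-written checks stand in for the deserialization and validation that the third-party libraries named above perform.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class ExamplePlatformSchema:
    """Hypothetical runbook section; not a class from schema.py."""

    type: str
    admin_username: str = "example-user"
    tags: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Stand-in for schema validation (assumption: real rules differ).
        if not self.type:
            raise ValueError("type must not be empty")


def load_section(data: Dict[str, Any]) -> ExamplePlatformSchema:
    """Stand-in for deserializing one runbook section from a dict."""
    known = {f.name for f in fields(ExamplePlatformSchema)}
    unknown = set(data) - known
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")
    return ExamplePlatformSchema(**data)
```

Defaults live on the dataclass fields, so a runbook only needs to spell out the values it wants to override.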

Committing Guidelines

A best practice when using Git is to create a series of independent and well-documented commits. Each commit should “do one thing” and do it correctly. If a mistake is made (you need to fix a bug or adjust formatting), you should amend it (or use an interactive rebase to edit it). If you’re using Emacs, the Magit package makes all of this easy. One reason to polish each commit is that it aids immensely in future debugging: it lets us use tools like git bisect to automatically find bugs, and understand why prior code was written. Although some of it has gone out of date, see this otherwise great essay on Git best practices. For how Git works, read Git from the Bottom Up.

For writing your commit messages, see this modification of Tim Pope’s example:

Capitalized, short (72 chars or less) summary

More detailed explanatory text, if necessary. Wrap it to about 72 characters or so. In some contexts, the first line is treated as the subject of an email and the rest of the text as the body. The blank line separating the summary from the body is critical (unless you omit the body entirely); tools like rebase can get confused if you run the two together.

Write your commit message in the imperative: “Fix bug” and not “Fixed bug” or “Fixes bug.” This convention matches up with commit messages generated by commands like git merge and git revert.

Further paragraphs come after blank lines.

- Bullet points are okay, too

- Typically a hyphen or asterisk is used for the bullet, followed by a single space, with blank lines in between, but conventions vary here

- Use a hanging indent

You should also feel free to use Markdown in the commit messages, as our project is hosted on GitHub which renders it (and Markdown is human readable).

Design Patterns

The most important goal we are attempting to accomplish with LISAv3 is for it to be “simple, clean, and with a low maintenance cost.”

We should use caution when using Object Oriented Design, because when it is used without critical analysis, it creates unmaintainable code. A great talk on this subject is Stop Writing Classes, by Jack Diederich. As he says, “classes are great but they are also overused.”

This Python Design Patterns is a fantastic collection of material for writing maintainable Python code. It specifically details many of the common “Object Oriented” patterns from the Gang of Four book (which, in fact, were patterns geared toward languages like C++, and no longer apply to modern languages like Python), what lessons can be learned from them, and how to apply them (or their modern alternatives) today. It also serves as an easy-to-read guide to the Gang of Four book itself, as its principles still serve us well today.

Every time a developer chooses to use a design pattern, that person needs to reason through and document why it was chosen, and what alternatives were considered. We will recreate the problems with LISAv2 unless we take our time to carefully create a well-designed and maintainable framework.

Several popular patterns that actually do not work well in Python are:

Conversely, patterns that are a natural fit to Python include:

The Composite Pattern
The Iterator Pattern (caution: it is actually better to implement these with yield!)
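The caution about yield can be illustrated by writing the same iterator twice. This is a sketch with hypothetical names (CountdownIterator, countdown): first the classic class-based form, then the generator function that replaces it.

```python
from typing import Iterator


class CountdownIterator:
    """Classic Iterator Pattern: explicit __iter__ and __next__."""

    def __init__(self, start: int) -> None:
        self._current = start

    def __iter__(self) -> "CountdownIterator":
        return self

    def __next__(self) -> int:
        if self._current <= 0:
            raise StopIteration
        value = self._current
        self._current -= 1
        return value


def countdown(start: int) -> Iterator[int]:
    """The same behavior as a generator function."""
    while start > 0:
        yield start
        start -= 1
```

Both can be used in a for loop, but the generator keeps its state in ordinary local variables, so there is no class to maintain.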

Finally, a high-level guide to all things Python is The Hitchhiker’s Guide to Python. It covers just about everything in the Python world. If you make it through even some of these guides, you will be well on your way to being a “Pythonista” (a Python developer) writing “Pythonic” (canonically correct Python) code left and right.

Async IO

The asyncio module was added in Python 3.4, and the async and await keywords followed in Python 3.5, bringing the Async IO pattern found in languages such as C# and Go to Python. Please read Async IO in Python: A Complete Walkthrough to understand at a high level how asynchronous programming works. One major “gotcha” is that asyncio.run(...), available as of Python 3.7, should be used exactly once, in main, where it starts the event loop. Everything else should be a coroutine or task which the event loop schedules.
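A minimal sketch of that structure follows, assuming Python 3.7 or later. The coroutine check_node and the node names are hypothetical, and asyncio.sleep stands in for real I/O.

```python
import asyncio
from typing import List


async def check_node(name: str, delay: float) -> str:
    # asyncio.sleep stands in for real I/O, such as a remote command.
    await asyncio.sleep(delay)
    return f"{name}: ok"


async def main() -> List[str]:
    # gather runs the coroutines concurrently on the running event loop
    # and returns their results in argument order.
    return await asyncio.gather(
        check_node("node-0", 0.02),
        check_node("node-1", 0.01),
    )


if __name__ == "__main__":
    # The single call that starts the event loop.
    print(asyncio.run(main()))
```

Only the entry point calls asyncio.run; check_node and main are coroutines that the event loop schedules.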

Future Sections

Just a collection of reminders for the author to expand on later.

unittest
doctest
subprocess
GitHub Actions
ShellCheck
Governance
Maintenance Cost
Parallelism and multiplexing
Versioned inputs and outputs
