PEP: 423
Title: Naming conventions and recipes related to packaging
Version: $Revision$
Last-Modified: $Date$
Author: Benoit Bryon <benoit at marmelune.net>
Discussions-To: <distutils-sig at python.org>
Status: Deferred
Type: Informational
Content-Type: text/x-rst
Created: 24-May-2012
Post-History:

Abstract

This document deals with:

  • names of Python projects,
  • names of Python packages or modules being distributed,
  • namespace packages.

It provides guidelines and recipes for distribution authors.

PEP Deferral

Further consideration of this PEP has been deferred at least until after PEP 426 (package metadata 2.0) and related updates have been resolved.

Relationship with other PEPs

  • PEP 8 [2] is the code style guide. It covers the syntax of package and module names.
  • PEP 345 [3] deals with packaging metadata, and defines the name argument of the packaging.core.setup() function.
  • PEP 420 [4] deals with namespace packages. It brings support for namespace packages to the Python core. Previously, namespace packages were implemented by external libraries.
  • PEP 3108 [5] deals with the transition between Python 2.x and Python 3.x as applied to the standard library: some modules are deleted, some are renamed. It points out that naming conventions matter, and it is an example of a transition plan.

Overview

Here is a summarized list of guidelines you should follow to choose names:

If in doubt, ask

If you feel unsure after reading this document, ask the Python community [6] on IRC or on a mailing list.

Top-level namespace relates to code ownership

This helps avoid clashes between project names.

Ownership could be:

  • an individual. Example: gp.fileupload [7] is owned and maintained by Gael Pasgrimaud.
  • an organization. Examples:
    • zest.releaser [8] is owned and maintained by Zest Software.
    • Django [9] is owned and maintained by the Django Software Foundation.
  • a group or community. Example: sphinx [10] is maintained by developers of the Sphinx project, not only by its author, Georg Brandl.
  • a group or community related to another package. Example: collective.recaptcha [12] is owned by its author: David Glick, Groundwire. But the "collective" namespace is owned by the Plone community.

Respect ownership

Understand the purpose of a namespace before you use it.

Don't plug into a namespace you don't own, unless explicitly authorized.

If in doubt, ask.

As an example, don't plug into the "django.contrib" namespace, because it is managed by Django's core contributors.

Exceptions can be defined by project authors. See Organize community contributions below.

Also, this rule applies to non-Python projects.

As an example, don't use "apache" as a top-level namespace: "Apache" is the name of an existing project (in the case of "Apache", it is also a trademark).

Private (including closed-source) projects use a namespace

... because private projects are owned by somebody. So apply the ownership rule.

For internal/customer projects, use your company name as the namespace.

This rule applies to closed-source projects.

As an example, if you are creating a "climbing" project for the "Python Sport" company, use the name "pythonsport.climbing", even if the project is closed source.
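To make the idea of a company namespace concrete, here is a minimal sketch of how two separately distributed projects can share the "pythonsport" top-level namespace under PEP 420. The second project name, the temporary directory layout, and the helper names are assumptions made for illustration only; a real project would ship these directories through its packaging tool rather than build them at run time.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_project(root: Path, dotted_name: str) -> Path:
    """Create a throwaway source tree for one project in a shared namespace."""
    namespace, package = dotted_name.split(".")
    project_dir = root / dotted_name  # one directory per project
    package_dir = project_dir / namespace / package
    package_dir.mkdir(parents=True)
    # Only the leaf package gets an __init__.py. The namespace directory
    # ("pythonsport") has none, which is what makes it a PEP 420
    # implicit namespace package.
    (package_dir / "__init__.py").write_text(f"NAME = {dotted_name!r}\n")
    return project_dir


def import_from_projects(names):
    """Put every project on sys.path, then import each dotted name."""
    root = Path(tempfile.mkdtemp())
    for name in names:
        sys.path.insert(0, str(make_project(root, name)))
    importlib.invalidate_caches()
    return [importlib.import_module(name).NAME for name in names]


# Two hypothetical projects owned by the same company.
loaded = import_from_projects(["pythonsport.climbing", "pythonsport.forestmap"])
print(loaded)
```

Both imports succeed even though the two packages live in different directories, because neither project claims the "pythonsport" directory for itself.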

Individual projects use a namespace

... because they are owned by individuals. So apply the ownership rule.

There is no shame in releasing a project as open source even if it has an "internal" or "individual" name.

If the project comes to a point where the author wants to change ownership (i.e. the project no longer belongs to an individual), keep in mind it is easy to rename the project.

Community-owned projects can avoid namespace packages

If your project is generic enough (i.e. it is not a contrib to another product or framework), you can avoid namespace packages. The base condition is generally that your project is owned by a group (i.e. the development team) which is dedicated to this project.

Only use a "shared" namespace if you really intend the code to be community owned.

As an example, the sphinx [10] project belongs to the Sphinx development team. There is no need to have a "sphinx" namespace package with only one "sphinx.sphinx" project inside.

In doubt, use an individual/organization namespace

If your project is really experimental, the best choice is to use an individual or organization namespace:

  • it allows projects to be released early.
  • it won't block a name if the project is abandoned.
  • it doesn't block future changes. When a project becomes mature and there is no reason to keep individual ownership, it remains possible to rename the project.

Use a single name

Distribute only one package (or only one module) per project, and use the package (or module) name as the project name.

  • It avoids possible confusion between project name and distributed package or module name.

  • It makes the name consistent.

  • It is explicit: someone who sees the project name can guess the package/module name, and vice versa.

  • It also limits implicit clashes between package/module names. By using a single name, when you register a project name with PyPI [11], you also perform a basic package/module name availability verification.

    As an example, pipeline [13], python-pipeline [14] and django-pipeline [15] all distribute a package or module called "pipeline". So installing two of them leads to errors. This issue wouldn't have occurred if these distributions used a single name.

Yes:

  • Package name: "kheops.pyramid", i.e. import kheops.pyramid
  • Project name: "kheops.pyramid", i.e. pip install kheops.pyramid

No:

  • Package name: "kheops"
  • Project name: "KheopsPyramid"
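The rule and the clash it prevents can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the mapping of projects to distributed names is hand-written from the examples in this section rather than read from PyPI.

```python
from collections import defaultdict


def uses_single_name(project_name, distributed_names):
    """True when exactly one package/module is shipped and it is named
    exactly like the project."""
    return list(distributed_names) == [project_name]


def find_import_clashes(distributions):
    """Map each import name to the projects shipping it, keeping only
    names shipped by more than one project."""
    providers = defaultdict(list)
    for project, names in distributions.items():
        for name in names:
            providers[name].append(project)
    return {
        name: sorted(projects)
        for name, projects in providers.items()
        if len(projects) > 1
    }


# Hand-written sample: project name -> distributed package/module names.
SAMPLE = {
    "pipeline": ["pipeline"],
    "python-pipeline": ["pipeline"],
    "django-pipeline": ["pipeline"],
    "kheops.pyramid": ["kheops.pyramid"],
}

print(uses_single_name("kheops.pyramid", ["kheops.pyramid"]))  # True
print(uses_single_name("KheopsPyramid", ["kheops"]))           # False
print(find_import_clashes(SAMPLE))
```

With the sample above, only "pipeline" is reported, because three projects distribute an import name that matches just one of the three project names.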

Note

For historical reasons, PyPI [11] contains many distributions where project and distributed package/module names differ.

Multiple packages/modules should be rare

Technically, Python distributions can provide multiple packages and/or modules. See setup script reference [16] for details.

Some distributions actually do. As an example, setuptools [17] and distribute [18] both declare "pkg_resources", "easy_install" and "site" modules in addition to their respective "setuptools" and "distribute" packages.

Consider this use case as exceptional. In most cases, you don't need this feature. So a distribution should provide only one package or module at a time.

Distinct names should be rare

A notable exception to the Use a single name rule is when you explicitly need distinct names.

As an example, the Pillow [19] project provides an alternative to the original PIL [20] distribution. Both projects distribute a "PIL" package.

Consider this use case as exceptional. In most cases, you don't need this feature. So a distributed package name should be equal to project name.

Follow PEP 8 for syntax of package and module names

PEP 8 [2] applies to names of Python packages and modules.

If you Use a single name, PEP 8 [2] also applies to project names. The exception is namespace packages, where dots are required in the project name.
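As a rough aid, the syntactic part of this guideline can be checked mechanically. The sketch below assumes a simplified reading of PEP 8: every dot-separated part must be an importable, all-lowercase identifier. It does not judge whether a name is short, and it does not flag underscores, which PEP 8 allows in module names but discourages in package names. The function name is hypothetical.

```python
import keyword


def follows_pep8_naming(dotted_name):
    """Return True if every dot-separated part is a lowercase identifier
    that is not a reserved keyword."""
    return all(
        part.isidentifier()
        and part == part.lower()
        and not keyword.iskeyword(part)
        for part in dotted_name.split(".")
    )


for candidate in ("kheops.pyramid", "KheopsPyramid", "django-pipeline"):
    print(candidate, follows_pep8_naming(candidate))
```

Of the three candidates, only "kheops.pyramid" passes: "KheopsPyramid" contains uppercase letters and "django-pipeline" contains a hyphen, so it cannot be imported under that name.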

Pick memorable names

One important thing about a project name is that it be memorable.

As an example, celery [21] is not a meaningful name. At first, it is not obvious that it deals with message queuing. But it is memorable, partly because it can be used to feed a RabbitMQ [22] server.

Pick meaningful names

Ask yourself "how would I describe in one sentence what this name is for?", and then "could anyone have guessed that by looking at the name?".

As an example, DateUtils [23] is a meaningful name. It is obvious that it deals with utilities for dates.

When you are using namespaces, try to make each part meaningful.

Use packaging metadata

Consider project names as unique identifiers on PyPI:

  • it is important that these identifiers remain human-readable.
  • it is even better when these identifiers are meaningful.
  • but the primary purpose of identifiers is not to classify or describe projects.

Classifiers and keywords metadata are made for categorization of distributions. Summary and description metadata are meant to describe the project.

As an example, there is a "Framework :: Twisted [24]" classifier. Even though the names of the matching projects are quite heterogeneous (they don't follow a particular pattern), filtering on this classifier returns the list of them.

In order to Organize community contributions, conventions about names and namespaces matter, but conventions about metadata should be even more important.

As an example, we can find Plone portlets in many places:

  • plone.portlet.*
  • collective.portlet.*
  • collective.portlets.*
  • collective.*.portlets
  • some vendor-related projects such as "quintagroup.portlet.cumulus"
  • and even projects where "portlet" pattern doesn't appear in the name.

Even though the Plone community has conventions, using the name to categorize distributions is inappropriate. It's impossible to get the full list of distributions that provide portlets for Plone by filtering on names. But it would be possible if all these distributions used the "Framework :: Plone" classifier and the "portlet" keyword.
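The difference between filtering on names and filtering on metadata can be shown with a small sketch. The records below are invented for illustration: the classifiers and keywords attached to each name are assumptions, not the real metadata of any published distribution, and the dictionary layout is hypothetical.

```python
def find_plone_portlets(distributions):
    """Select distributions by classifier and keyword, ignoring names."""
    return sorted(
        dist["name"]
        for dist in distributions
        if "Framework :: Plone" in dist.get("classifiers", ())
        and "portlet" in dist.get("keywords", ())
    )


# Invented metadata records (name, classifiers, keywords).
SAMPLE = [
    {"name": "collective.portlet.example",
     "classifiers": ["Framework :: Plone"], "keywords": ["portlet"]},
    {"name": "quintagroup.portlet.cumulus",
     "classifiers": ["Framework :: Plone"], "keywords": ["portlet", "tags"]},
    # No "portlet" in the name, yet it is found through its metadata.
    {"name": "example.weather",
     "classifiers": ["Framework :: Plone"], "keywords": ["portlet"]},
    # "portlet" in the name, but not a Plone distribution.
    {"name": "example.portlet.unrelated",
     "classifiers": [], "keywords": ["portlet"]},
]

by_metadata = find_plone_portlets(SAMPLE)
by_name = sorted(d["name"] for d in SAMPLE if "portlet" in d["name"])
print(by_metadata)
print(by_name)
```

The name-based filter misses "example.weather" and wrongly includes "example.portlet.unrelated", while the metadata-based filter returns exactly the three Plone portlet distributions.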

Avoid deep nesting

The Zen of Python [25] says "Flat is better than nested".

Two levels is almost always enough

Don't define everything in deeply nested hierarchies: you will end up with projects and packages like "pythonsport.common.maps.forest". This type of name is both verbose and cumbersome (e.g. if you have many imports from the package).

Furthermore, big hierarchies tend to break down over time as the boundaries between different packages blur.

The consensus is that two levels of nesting are preferred.

For example, we have plone.principalsource instead of plone.source.principal or something like that. The name is shorter, the package structure is simpler, and there would be very little to gain from having three levels of nesting here. It would be impractical to try to put all "core Plone" sources (a source is a kind of vocabulary) into the plone.source.* namespace, in part because some sources are part of other packages, and in part because sources already exist in other places. Had we made a new namespace, it would be inconsistently used from the start.

Yes: "pyranha"

Yes: "pythonsport.climbing"

Yes: "pythonsport.forestmap"

No: "pythonsport.maps.forest"

Use only one level for ownership

Don't use 3 levels to set individual/organization ownership in a community namespace.

As an example, let's consider:

  • you are plugging into a community namespace, such as "collective".
  • and you want to add a more restrictive "ownership" level, to avoid clashes inside the community.

In such a case, you should use the most restrictive ownership level as the first level.

As an example, where "collective" is a major community namespace that "gergovie" belongs to, and "vercingetorix" is the name of the author of "gergovie":

No: "collective.vercingetorix.gergovie"

Yes: "vercingetorix.gergovie"

Don't use more than 3 levels

Technically, you can create deeply nested hierarchies. However, in most cases, you shouldn't need it.

Note

Even communities where namespaces are standard don't use more than 3 levels.
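The depth guidance in this section can be summarized as a small check. The three labels and the function names are assumptions chosen for this sketch; the thresholds follow the text above (one or two levels preferred, three levels discouraged, more than three to be avoided).

```python
def nesting_levels(name):
    """Count the dot-separated levels of a project or package name."""
    return len(name.split("."))


def nesting_advice(name):
    """Classify a name against the depth guidance of this section."""
    levels = nesting_levels(name)
    if levels <= 2:
        return "ok"
    if levels == 3:
        return "discouraged"
    return "too deep"


for candidate in (
    "pyranha",
    "pythonsport.climbing",
    "pythonsport.maps.forest",
    "pythonsport.common.maps.forest",
):
    print(candidate, nesting_levels(candidate), nesting_advice(candidate))
```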

Register names with PyPI

PyPI [11] is the central place for distributions in the Python community. So, it is also the place to register project and package names.

See Registering with the Package Index [27] for details.

Recipes

The following recipes will help you follow the guidelines and conventions above.

How to check for name availability?

Before you choose a project name, make sure it hasn't already been registered in the following locations:

  • PyPI [11]
  • that's all. PyPI is the only official place.

You could also check other locations, such as popular code hosting services, but keep in mind that PyPI is the only place where you can register names in the Python community.

That's why it is important you register names with PyPI.

Also make sure the names of distributed packages or modules haven't already been registered.

The Use a single name rule also helps you avoid clashes with package names: if a project name is available, then the package name has a good chance of being available too.
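When comparing a candidate against names that are already taken, keep in mind that the index does not compare names character by character. The sketch below is not part of the original recipe: it assumes the normalization rule later specified in PEP 503 (case-insensitive, with runs of ".", "-" and "_" treated as equivalent), and it checks against a hand-written set of names rather than querying PyPI. The function names and the sample set are hypothetical.

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_available(candidate, registered_names):
    """True if no registered name normalizes to the same string."""
    taken = {normalize(name) for name in registered_names}
    return normalize(candidate) not in taken


# Hand-written sample standing in for names already registered.
REGISTERED = {"kheops.pyramid", "django-pipeline", "Pillow"}

print(is_available("pythonsport.climbing", REGISTERED))  # True
print(is_available("Kheops_Pyramid", REGISTERED))        # False
```

The second candidate is rejected even though its spelling differs, because both spellings normalize to "kheops-pyramid".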

How to rename a project?

Renaming a project is possible, but keep in mind that it will cause some confusion. So, pay particular attention to the README and documentation, so that users understand what happened.

  1. First of all, do not remove legacy distributions from PyPI, because some users may still be using them.
  2. Copy the legacy project, then change names (project and package/module). Pay attention to, at least:
    • packaging files,
    • folder name that contains source files,
    • documentation, including README,
    • import statements in code.
  3. Assign Obsoletes-Dist metadata to the new distribution in its setup.cfg file. See PEP 345 about Obsoletes-Dist [29] and the setup.cfg specification [30].
  4. Release a new version of the renamed project, then publish it.
  5. Edit the legacy project:
    • add a dependency on the new project,
    • drop everything except packaging stuff,
    • add the Development Status :: 7 - Inactive classifier in the setup script,
    • publish a new release.

So, users of the legacy package:

  • can continue using the legacy distribution at a deprecated version,
  • can upgrade to the latest version of the legacy distribution, which is empty...
  • ... and automatically download the new distribution as a dependency of the legacy one.

Users who discover the legacy project see it is inactive.
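Step 5 of the recipe above leaves the legacy project as an empty shell that only points to its replacement. The sketch below builds the metadata for such a shell as a plain dictionary. It is an assumption-laden illustration: the keys are the keyword names accepted by setuptools' setup() (name, version, description, packages, install_requires, classifiers) rather than the setup.cfg format referenced in this document, and the project names and version are hypothetical.

```python
def legacy_shim_metadata(legacy_name, new_name, shim_version):
    """Build setup() keyword arguments for the final, empty release of a
    renamed project."""
    return {
        "name": legacy_name,
        "version": shim_version,
        "description": f"Renamed to {new_name}; install that project instead.",
        # Everything except packaging files was dropped: no code is shipped.
        "packages": [],
        # The single dependency "redirects" users to the new project.
        "install_requires": [new_name],
        # Tells visitors on PyPI that the project is no longer developed.
        "classifiers": ["Development Status :: 7 - Inactive"],
    }


# Hypothetical names: "oldname" was renamed to "vercingetorix.gergovie".
shim = legacy_shim_metadata("oldname", "vercingetorix.gergovie", "2.0")
print(shim["install_requires"])
```

In a real setup script the dictionary would be passed on as `setup(**shim)`. Building it separately keeps the sketch runnable without invoking any packaging tool.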

Improved handling of renamed projects on PyPI

If many projects follow the "How to rename a project?" recipe, then many legacy distributions will have the following characteristics:

  • Development Status :: 7 - Inactive classifier.
  • latest version is empty, except packaging stuff.
  • latest version "redirects" to another distribution. E.g. it has a single dependency on the renamed project.
  • referenced as Obsoletes-Dist in a newer distribution.

So it will be possible to detect renamed projects and improve readability on PyPI, so that users can focus on active distributions. This feature is not urgently required, however, and it is not covered in this document.
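Purely as an illustration of the four characteristics listed above, and not as a proposal for PyPI itself, a detection heuristic could look like the following. The record layout (name, classifiers, packages, requires, obsoletes) and the project names are hypothetical and do not correspond to any existing API.

```python
INACTIVE = "Development Status :: 7 - Inactive"


def looks_renamed(dist, others=()):
    """Return True when a distribution shows all four signs of a rename."""
    inactive = INACTIVE in dist.get("classifiers", ())
    empty = not dist.get("packages")
    redirects = len(dist.get("requires", ())) == 1
    obsoleted = any(
        dist["name"] in other.get("obsoletes", ()) for other in others
    )
    return inactive and empty and redirects and obsoleted


# Hypothetical records for a legacy project and its replacement.
legacy = {
    "name": "oldname",
    "classifiers": [INACTIVE],
    "packages": [],
    "requires": ["newname"],
}
replacement = {
    "name": "newname",
    "classifiers": [],
    "packages": ["newname"],
    "requires": [],
    "obsoletes": ["oldname"],
}

print(looks_renamed(legacy, [replacement]))       # True
print(looks_renamed(replacement, [legacy]))       # False
```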

How to apply naming guidelines on existing projects?

There is no obligation for existing projects to be renamed. The choice is left to project authors and maintainers.

However, project authors are invited to:

State about current naming

The important thing, at first, is that you document your current choices:

  • Ask yourself "why did I choose the current name?", then document it.
  • If there are differences with the guidelines provided in this document, you should tell your users.
  • If possible, create issues in the project's bugtracker, at least for record. Then you are free to resolve them later, or maybe mark them as "wontfix".

Projects that are meant to receive contributions from the community should also organize community contributions.

Promote migrations

Every Python developer should migrate whenever possible, or promote the migrations in their respective communities.

Apply these guidelines on your projects, then the community will see it is safe.

In particular, "leaders" such as authors of popular projects are influential: they have power over communities and, thus, responsibility towards them.

Apply these guidelines on popular projects, then communities will adopt the conventions too.

Projects should promote migrations when they release a new (major) version, particularly if this version introduces support for Python 3.x, the standard library's new packaging, or namespace packages.

Opportunity

While Python 3.3 is being developed:

  • many projects are not Python 3.x compatible. This includes "big" products or frameworks. It means that many projects will have to migrate to support Python 3.x.
  • packaging (aka distutils2) is on the starting blocks. When it is released, projects will be invited to migrate and use the new packaging.
  • PEP 420 [4] brings official support of namespace packages to Python.

It means that most active projects are likely to migrate in the next year(s) to support Python 3.x, the new packaging, or the new namespace packages.

Such an opportunity is unique and won't come again soon! So let's introduce and promote naming conventions as soon as possible (i.e. now).

References

Additional background:

References and footnotes:

[1] http://docs.python.org/dev/packaging/introduction.html#general-python-terminology
[2] http://www.python.org/dev/peps/pep-0008/#package-and-module-names
[3] http://www.python.org/dev/peps/pep-0345/
[4] http://www.python.org/dev/peps/pep-0420/
[5] http://www.python.org/dev/peps/pep-3108/
[6] http://www.python.org/community/
[7] http://pypi.python.org/pypi/gp.fileupload/
[8] http://pypi.python.org/pypi/zest.releaser/
[9] http://djangoproject.com/
[10] http://sphinx.pocoo.org
[11] http://pypi.python.org
[12] http://pypi.python.org/pypi/collective.recaptcha/
[13] http://pypi.python.org/pypi/pipeline/
[14] http://pypi.python.org/pypi/python-pipeline/
[15] http://pypi.python.org/pypi/django-pipeline/
[16] http://docs.python.org/dev/packaging/setupscript.html
[17] http://pypi.python.org/pypi/setuptools
[18] http://packages.python.org/distribute/
[19] http://pypi.python.org/pypi/Pillow/
[20] http://pypi.python.org/pypi/PIL/
[21] http://pypi.python.org/pypi/celery/
[22] http://www.rabbitmq.com
[23] http://pypi.python.org/pypi/DateUtils/
[24] http://pypi.python.org/pypi?:action=browse&show=all&c=525
[25] http://www.python.org/dev/peps/pep-0020/
[26] http://plone.org/community/develop
[27] https://docs.python.org/3/distutils/packageindex.html
[28] http://docs.python.org/library/index.html
[29] http://www.python.org/dev/peps/pep-0345/#obsoletes-dist-multiple-use
[30] http://docs.python.org/dev/packaging/setupcfg.html
[31] http://www.martinaspeli.net/articles/the-naming-of-things-package-names-and-namespaces
[32] http://docs.python.org/dev/packaging/
[33] http://guide.python-distribute.org/specification.html#naming-specification