API Documentation

This page documents the public API of mmappickle. It covers the classes and functions most commonly needed when using the library.

Developers interested in extending mmappickle or in understanding its inner workings should read the Internals page.

Main module

class mmappickle.mmapdict(file, readonly=None, picklers=None)[source]

Class to access a mmap-able dictionary in a file.

This class is safe to use in a multi-process environment.

__init__(file, readonly=None, picklers=None)[source]

Create or load a mmap dictionary.

Parameters:
  • file – either a file-like object or a string representing the name of the file.
  • readonly – if file is a string and this is set to True, the file is opened in read-only mode.
  • picklers – explicit list of picklers. Usually this is not needed (by default, all are used).
writable

True if the file is writable, False otherwise

commit_number

The monotonically increasing commit number of the mmapdict.

This is useful to know whether the keys have been changed by another process. If the commit_number hasn't changed, the result of keys() is guaranteed to be unchanged.

Although it is possible to set the commit number using this property, there is generally no use for this in external code.
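
The guarantee above suggests a simple caching pattern: re-read the keys only when commit_number has moved. The following is a minimal sketch under that assumption. `KeyCache` and its `refreshes` counter are hypothetical names introduced for illustration; only `commit_number` and `keys()` come from the documented API.

```python
class KeyCache:
    """Cache the result of keys() and refresh it only when commit_number changes.

    `d` is expected to expose `commit_number` and `keys()` as described above.
    This wrapper is illustrative and not part of mmappickle.
    """

    def __init__(self, d):
        self._d = d
        self._commit = None          # commit number seen at the last refresh
        self._keys = frozenset()
        self.refreshes = 0           # how many times keys() was re-read

    def keys(self):
        # Read the commit number first: if another process commits between
        # this read and the keys() call, the next call simply refreshes again.
        current = self._d.commit_number
        if current != self._commit:
            self._keys = frozenset(self._d.keys())
            self._commit = current
            self.refreshes += 1
        return self._keys
```

Wrapping an open mmapdict as `KeyCache(d)` and calling `keys()` in a polling loop would then avoid re-reading the key list while no other process has committed a change.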

__contains__(k)[source]

Check if a key exists in the dictionary.

Parameters: k – key (string) to check for existence
Returns: True if the key exists in the dictionary, False otherwise.
__weakref__

list of weak references to the object (if defined)

keys()[source]
Returns: a set-like object providing a view on the dictionary's keys
__setitem__(k, v)[source]

Create or change key k, setting its value to v.

Parameters:
  • k – key, should be a Unicode string whose binary (encoded) length is <= 255.
  • v – value, any picklable object

When replacing a value, this function adds the new key-value pair at the end of the file, and marks the old one as invalid, but leaves the data in place. As a consequence, this function can be used when using the file concurrently from multiple processes. However, other processes may still be using the old value if they don’t reload the value from the file.

If no concurrent access exists to the file, the old value can be freed using vacuum().
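
The key constraint concerns the encoded length, not the number of characters, so a string with fewer than 255 characters can still be too long. A minimal sketch of a pre-check follows. `is_valid_key` is a hypothetical helper, and the use of UTF-8 is an assumption: the text above only says "binary length" and does not name the encoding.

```python
def is_valid_key(k, max_bytes=255):
    """Return True if k is a str whose encoded length fits within max_bytes.

    Assumption: keys are encoded as UTF-8. The documentation above does not
    state which encoding mmappickle uses.
    """
    if not isinstance(k, str):
        return False
    try:
        encoded = k.encode("utf-8")
    except UnicodeEncodeError:
        # e.g. lone surrogates cannot be encoded
        return False
    return len(encoded) <= max_bytes
```

For instance, 255 ASCII characters pass this check, while 128 copies of "é" do not, because each one occupies two bytes in UTF-8.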

__getitem__(k)[source]

Get the value for key k; raise KeyError if the key doesn't exist in the file.

If possible, the data will be returned as a mmap’ed object.
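
As a rough illustration of how item assignment and lookup fit together with the constructor's readonly flag, here is a minimal sketch. `write_then_read` is a hypothetical helper; the class is passed in as `mmapdict_cls` so that the snippet only relies on the documented signature `mmapdict(file, readonly=None, picklers=None)`.

```python
def write_then_read(mmapdict_cls, path, key, value):
    """Store value under key, then read it back through a read-only handle.

    `mmapdict_cls` is expected to behave like mmappickle.mmapdict;
    `path` is the name of the backing file.
    """
    writer = mmapdict_cls(path)                  # creates or loads the file
    writer[key] = value                          # __setitem__
    reader = mmapdict_cls(path, readonly=True)   # second, read-only handle
    return reader[key]                           # __getitem__, KeyError if absent
```

With mmappickle installed, this could be called as `write_then_read(mmappickle.mmapdict, "example.mmdpickle", "answer", 42)`, where the file name is arbitrary. Whether the returned object is memory-mapped depends on the type of the stored value.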

__delitem__(k)[source]

Mark key k as not valid in the file.

Parameters: k – key to remove

This method marks the key as invalid, but leaves the data in place. As a consequence, this function can be used when using the file concurrently from multiple processes. However, other processes may still be using the value if they don’t reload the keys from the file.

If no concurrent access exists to the file, the old value can be freed using vacuum().

vacuum(chunk_size=1048576)[source]

Free all deleted keys, effectively reclaiming disk space.

Only use this function when no mmap exists on the file. It is usually safest to run it only in parts of the code where there is no concurrent access.

Parameters: chunk_size – the size of the buffer used to shift data in the file.

Warning

No mmap should exist on this file (neither in this Python script nor in any other), as the data will be shifted.

If an mmap exists, it could crash the process and/or corrupt the file and/or return invalid data.
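
To see why shifting data invalidates existing mappings, consider a toy in-memory model of an append-only log. This is only a sketch of the general idea (append, mark invalid, compact). `ToyLog` is a hypothetical class and does not reflect mmappickle's actual file format.

```python
class ToyLog:
    """Toy append-only log: records are appended, invalidated, then compacted."""

    def __init__(self):
        self.records = []  # each record is [key, payload_bytes, valid_flag]

    def set(self, key, payload):
        self.delete(key)                       # mark any old record invalid
        self.records.append([key, payload, True])

    def delete(self, key):
        for rec in self.records:
            if rec[0] == key and rec[2]:
                rec[2] = False                 # data stays in place

    def offset(self, key):
        """Byte offset at which the valid record for key starts."""
        pos = 0
        for k, payload, valid in self.records:
            if k == key and valid:
                return pos
            pos += len(payload)
        raise KeyError(key)

    def vacuum(self):
        # Dropping invalid records shifts every later record toward the start.
        self.records = [r for r in self.records if r[2]]
```

In this model, deleting a key leaves the offsets of the other records untouched, whereas `vacuum()` moves them. A reader still holding an old offset would then be looking at the wrong bytes, which is the situation the warning describes.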

fsck()[source]

Attempt to fix the file, if possible.

This function should be called if some data could not be written to a file. This might be the case if, for example, not enough disk space was available.

This method truncates the file and recreates a valid terminator.

Warning

Calling this function may lead to data loss.
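
One way to apply this is to attempt a repair only after a write has actually failed. The sketch below assumes that a failed write surfaces as an OSError (for example, when the disk is full), which the text above does not state. `set_or_repair` is a hypothetical helper.

```python
def set_or_repair(d, key, value):
    """Assign d[key] = value; if the write fails, try to repair the file.

    Assumption: a failed write raises OSError. The original exception is
    re-raised after fsck() so the caller knows the value was not stored.
    """
    try:
        d[key] = value
    except OSError:
        d.fsck()   # may truncate the file, see the warning above
        raise
```

Because fsck() can discard data, calling it only in the failure path keeps the risk confined to cases where the file is already known to be incomplete.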

Stubs