.. _io-tutorial:

Getting Started with :mod:`dip.io`
==================================

The :mod:`dip.io` module serves two purposes.  Firstly, it enables
applications to read and write models.  Secondly, it provides support for
implementing new :term:`formats<format>` and forms of :term:`storage` so that
existing applications can use them without being changed.  The basis of all
this support is an :term:`i/o manager`, an object that implements, or can be
adapted to, the :class:`~dip.io.IIoManager` interface.


Concepts
--------

dip considers storage to be either :term:`streaming storage` or
:term:`structured storage`.  Streaming storage stores data as a byte stream.
A filesystem is the most common example of streaming storage.  Structured
storage stores data in a storage-specific structure.  An SQL database is an
example of structured storage.

A model is stored according to a data format.  When a model is written it is
encoded according to the format.  When it is read it is decoded from the
format.  A model will normally have a native format which is used when reading
and writing the model.  It may be possible to export a model to other formats
and to import a model from other formats.  Each format has a unique string
identifier.

A :term:`codec` is an object that implements an encoder and/or a decoder for a
particular format.  Streaming storage has no implicit format and so it is used
in conjunction with an explicit codec.  The codec effectively imposes the
structure of the data on top of the byte stream.  Structured storage has,
by definition, an implicit format and, therefore, an implicit codec.

A codec specifies encoder and decoder interfaces that a model must implement,
or be able to be adapted to, in order to be encoded and decoded.
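
As an illustration only, the following self-contained sketch shows how a codec
can impose structure on a byte stream.  It does not use dip's API:
``Note``, ``LineCodec`` and the format identifier ``'example.format.note'`` are
hypothetical stand-ins, and an in-memory stream plays the part of streaming
storage.

.. code-block:: python

    import io


    class Note:
        """A hypothetical model with two attributes."""

        def __init__(self, title='', body=''):
            self.title = title
            self.body = body


    class LineCodec:
        """A hypothetical codec: the first line is the title, the rest the body."""

        # Each format has a unique string identifier (this one is made up).
        format = 'example.format.note'

        def encode(self, model, stream):
            """Write the model to a binary stream as UTF-8."""
            stream.write((model.title + '\n' + model.body).encode('utf-8'))

        def decode(self, model, stream):
            """Populate the model from a binary stream and return it."""
            text = stream.read().decode('utf-8')
            title, _, body = text.partition('\n')
            model.title = title
            model.body = body
            return model


    # An in-memory byte stream stands in for streaming storage.
    stream = io.BytesIO()
    LineCodec().encode(Note('Shopping', 'Milk\nEggs'), stream)

    # Rewind and decode into a fresh model.
    stream.seek(0)
    restored = LineCodec().decode(Note(), stream)

The stream itself only holds bytes.  It is the codec that decides that the
first line is a title and the remainder is a body.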

A model is stored at a :term:`storage location` that is unique within a
particular piece of storage.  For example, the location of a model stored in a
filesystem is the absolute path name of the file containing the encoded model.
A storage location may be implicit.  This means that the location is
determined by the value of the model rather than being specified by the user.


Reading and Writing a Model
---------------------------

In :ref:`shell-tutorial` we briefly described the steps to be taken to ensure
that a model could be written to and read from storage.  Here we will
reiterate those steps.

- Unless one already exists, create a codec, i.e. an object that implements the
  :class:`~dip.io.ICodec` interface, for the model's native format.  A codec
  will specify decoder and encoder interfaces that a model must implement or be
  able to be adapted to.

- Ensure that the model implements the decoder and encoder interfaces.  This is
  normally done by creating appropriate adapters.  Whether or not a single
  adapter is used for both interfaces is a matter of personal programming
  style.

- Create an instance of the codec and add it to the i/o manager's list of
  codecs.

Note that some codecs may only support decoding or encoding but not both.  Such
codecs would not be used as a model's native format but they would still be
useful when importing from or exporting to non-native formats.
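
The last step, and the note about codecs that work in one direction only, can
be pictured with the following sketch.  It is not dip's implementation:
``ToyIoManager``, its ``find()`` method, both codec classes and the format
identifiers are hypothetical, and adapters are left out to keep it short.

.. code-block:: python

    class PlainTextCodec:
        """A hypothetical codec that both encodes and decodes."""

        format = 'example.format.plain'

        def decode(self, data):
            return data.decode('utf-8')

        def encode(self, text):
            return text.encode('utf-8')


    class UpperCaseDecoder:
        """A hypothetical decode-only codec, usable only for importing."""

        format = 'example.format.upper'

        def decode(self, data):
            return data.decode('utf-8').lower()


    class ToyIoManager:
        """A stand-in for an i/o manager that only keeps a list of codecs."""

        def __init__(self):
            self.codecs = []

        def find(self, format, need):
            """Return the first codec for a format that has the needed method."""
            for codec in self.codecs:
                if codec.format == format and hasattr(codec, need):
                    return codec
            return None


    manager = ToyIoManager()
    manager.codecs.append(PlainTextCodec())
    manager.codecs.append(UpperCaseDecoder())

    # A native format needs both directions; an import needs only a decoder.
    native = manager.find('example.format.plain', 'encode')
    importer = manager.find('example.format.upper', 'decode')
    exporter = manager.find('example.format.upper', 'encode')

In this sketch ``exporter`` is ``None`` because the decode-only codec cannot
be used to write, which is why such a codec would not serve as a native format.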

dip provides two codecs that between them cover many common cases.

- The :class:`~dip.io.codecs.unicode.UnicodeCodec` codec, with its
  :class:`~dip.io.codecs.unicode.IUnicodeDecoder` decoder interface and
  :class:`~dip.io.codecs.unicode.IUnicodeEncoder` encoder interface, stores a
  model as a Unicode byte stream.  By default it uses the UTF-8 encoding.

- The :class:`~dip.io.codecs.xml.XmlCodec` codec, with its
  :class:`~dip.io.codecs.xml.IXmlDecoder` decoder interface and
  :class:`~dip.io.codecs.xml.IXmlEncoder` encoder interface, stores a
  :class:`~dip.model.Model` instance as XML.


Implementing a New Type of Storage
----------------------------------

The need to implement a new type of storage arises less often than the need to
implement a new codec.  When the need does arise it is typically as a result of
some new technology or service becoming available that can be used by many
applications rather than something that is application-specific.  For example,
an organisation may subscribe to a cloud-based file service.  A new type of
storage would then be implemented to provide access to it.  All existing
applications could then use it without being changed.

The other situation that would require a new type of storage to be implemented
is when a database is being used as :term:`structured storage`.

In this section we describe the high-level steps taken to implement a new type
of storage, including the interfaces and classes that need to be written.

The :class:`~dip.io.IStreamingStorageFactory` interface must be implemented by
a streaming storage factory, and its :meth:`~dip.io.IStorageFactory.__call__`
method must return an implementation of the :class:`~dip.io.IStorage`
interface.

The :class:`~dip.io.IStructuredStorageFactory` interface must be implemented by
a structured storage factory, and its :meth:`~dip.io.IStorageFactory.__call__`
method must also return an implementation of the
:class:`~dip.io.IStorage` interface.

The :class:`~dip.io.IStorage` interface defines :meth:`~dip.io.IStorage.read`
and :meth:`~dip.io.IStorage.write` methods to do the reading and writing of an
object from and to a specific storage location.
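
The relationship between a factory and the storage it returns can be sketched
as follows.  This is a minimal stand-in rather than an implementation of the
dip interfaces: ``InMemoryStorage`` and ``InMemoryStorageFactory`` are
hypothetical, and the ``read()`` and ``write()`` signatures shown here are
assumptions that are not taken from :class:`~dip.io.IStorage`.

.. code-block:: python

    class InMemoryStorage:
        """A hypothetical storage that keeps encoded models in a dictionary."""

        def __init__(self):
            self._blobs = {}

        def write(self, location, data):
            """Store the encoded bytes at the given location."""
            self._blobs[location] = bytes(data)

        def read(self, location):
            """Return the encoded bytes stored at the given location."""
            try:
                return self._blobs[location]
            except KeyError:
                raise LookupError('nothing stored at %r' % (location,)) from None


    class InMemoryStorageFactory:
        """A hypothetical factory: calling it returns a storage instance."""

        def __call__(self):
            return InMemoryStorage()


    factory = InMemoryStorageFactory()
    storage = factory()

    # Locations are unique within this particular piece of storage.
    storage.write('notes/shopping', b'Milk\nEggs')
    data = storage.read('notes/shopping')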

:class:`~dip.io.IStorage` also defines the :attr:`~dip.io.IStorage.ui`
attribute which is an implementation of the :class:`~dip.io.IStorageUi`
interface.  This interface defines methods that create the necessary user
interfaces that the user will use to select a storage location.  For example,
the filesystem storage type included with dip provides access (assuming the
default PyQt4 toolkit) to :class:`~PyQt4.QtGui.QFileDialog` using this
mechanism.  A storage type that handles a database might implement a database
browser.


Defining a Storage Policy
-------------------------

Sometimes you may have a situation where a model can be read from or written
to a particular storage type, but you want to place restrictions on that
access and the options presented to the user.  For example, a certain type of
user may only be able to read from the storage, or access to the storage may
be limited to certain times of the day, or you simply wish to prevent a certain
type of model from ever being written to a certain type of storage.

The i/o manager will consult a list of :term:`storage policies<storage policy>`
to determine if a model using a particular format should actually be allowed to
be read from or written to a particular storage instance.  A policy is a
callable that is passed the format and the storage instance and should return
``True`` if the access is permitted.  If any policy returns ``False`` then the
access is denied.
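
The rule that every policy must agree can be sketched in a few lines.  The
names below (``ToyStorage``, both policy functions, ``is_permitted()`` and the
format identifiers) are hypothetical and only illustrate the idea.  How
policies are actually registered with the i/o manager is not shown.

.. code-block:: python

    class ToyStorage:
        """A hypothetical stand-in for a storage instance."""

        def __init__(self, name):
            self.name = name


    def no_secrets_in_cloud(format, storage):
        """Deny the secret format on cloud storage; permit everything else."""
        is_secret = format == 'example.format.secret'
        return not (is_secret and storage.name == 'cloud')


    def known_formats_only(format, storage):
        """Permit only formats whose identifier uses the example prefix."""
        return format.startswith('example.format.')


    def is_permitted(policies, format, storage):
        """Access is permitted only if no policy returns False."""
        return all(policy(format, storage) for policy in policies)


    policies = [no_secrets_in_cloud, known_formats_only]
    cloud = ToyStorage('cloud')
    local = ToyStorage('local')

    secret_on_cloud = is_permitted(policies, 'example.format.secret', cloud)
    secret_on_local = is_permitted(policies, 'example.format.secret', local)
    plain_on_cloud = is_permitted(policies, 'example.format.plain', cloud)

Here only ``secret_on_cloud`` is ``False``.  With an empty list of policies
this sketch permits every access, since no policy objects.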
