4. Overview
Note
Throughout this overview and in certain other sections, the examples are written for the Files-only/ development installation. This is only for simplicity: it lets you use the built-in examples/ sample files rather than having to define your own cleaning, learning, and inferring scripts.
If you are not using the Files-only/ development installation, you will have to point Nemesyst to the cleaners, learners, predictors, etc. that you want to use. Even if you are using Files-only/ development, once you have better understood and tested Nemesyst you should likely move to creating your own scripts and using a normal installation of Nemesyst, such as one of the Automated examples.
4.1. Nemesyst literal un-abstract stages
Nemesyst has been made generic enough to handle many possible configurations, but it cannot handle every scenario. Sometimes it may be necessary to configure certain aspects of the process manually, especially regarding MongoDB, as it is a well-developed, mature database with more features than we could, or should, automate.
4.2. Nemesyst Abstraction of stages
Deep learning can be said to include three stages: data wrangling, test-training, and inferring. Nemesyst adds an extra layer we call serving. This is the stage at which databases act as the message passing interface (MPI) and generator between the layers, machines, and algorithms, as well as the storage mechanism for data and models.
4.3. Nemesyst Parallelisation
As of: 2.0.1.r6.f9f92c3
Local parallelization of your scripts occurs using Python's process pools from multiprocessing. This diagram shows how the rounds of processing are abstracted and the order in which they run. Rounds do not continue between stages, i.e. if there is a spare process but not enough scripts left in the current stage (e.g. cleaning), Nemesyst will not fill it with a script process from the next stage (e.g. learning). This prevents a learning script from starting before a cleaning script whose output it may depend on has finished.
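The stage-by-stage behaviour described above can be illustrated with a minimal sketch. This is not Nemesyst's actual implementation; run_script, run_stages, and the script names are hypothetical stand-ins, and the only thing taken from the text is the use of a multiprocessing process pool with a hard barrier between stages.

```python
from multiprocessing import Pool


def run_script(name):
    """Hypothetical stand-in for executing one user script."""
    return f"{name}:done"


def run_stages(stages, processes=2):
    """Run each stage's scripts in a process pool, one stage at a time.

    A stage must finish completely before the next one starts, so a spare
    worker is never filled with a script from a later stage.
    """
    results = []
    for stage_name, scripts in stages:
        with Pool(processes=processes) as pool:
            # map() blocks until every script in this stage has returned
            results.append((stage_name, pool.map(run_script, scripts)))
    return results


if __name__ == "__main__":
    # Illustrative script names only (assumption, not Nemesyst files)
    stages = [
        ("cleaning", ["clean_a", "clean_b", "clean_c"]),
        ("learning", ["learn_a"]),
    ]
    for stage, output in run_stages(stages):
        print(stage, output)
```

With two processes and three cleaning scripts, the second round of cleaning leaves one worker idle rather than starting learn_a early, which is the trade-off the text describes.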
4.4. Wrangling / cleaning
See All Options by Category for a full list of options.
- Files-only/ development example:
nemesyst
4.5. Serving
See All Options by Category for a full list of options.
Nemesyst uses MongoDB databases, through PyMongo, as a data store and distribution mechanism. The database(s) are among the most important parts of the chain of processes, as nothing can operate without a properly functioning database. We have therefore attempted to simplify operations, both on the user-script side and on our side, by abstracting the slightly raw PyMongo interface into a much friendlier class of operations called Mongo.
A Mongo object is automatically passed into the entry point of every one of your scripts, so that you can also operate on the database easily if you choose to. Aside from our data generator, however, we handle the majority of use cases before they reach your scripts.
- Automated example:
# creating basic non-config, non-replica, localhost, mongodb instance
nemesyst --db-init --db-start --db-login --db-stop \
    --db-user-name USERNAME --db-password \
    --db-path DBPATH --db-log-path DBPATH/LOGDIR
Note
Please see Serving with MongoDB for more in-depth coverage of serving with Nemesyst.
4.6. Learning
See All Options by Category for a full list of options.
- Files-only/ development example:
nemesyst
Warning
Special attention should be paid to the size of the resultant neural networks. Beyond a certain size it will be necessary to store them as GridFS objects. Basic GridFS functionality is included in Nemesyst's Mongo; however, it is still experimental and should not be depended upon at this time.
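As a rough way to judge when a model is too large for an ordinary document, the following sketch compares a serialized model against MongoDB's documented 16 MiB BSON document limit. It makes several assumptions: the model is serialized with pickle (Nemesyst may serialize differently), and needs_gridfs and headroom_bytes are hypothetical names chosen for illustration, not part of Nemesyst's Mongo class.

```python
import pickle

# MongoDB's documented maximum BSON document size (16 MiB)
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024


def needs_gridfs(model, headroom_bytes=1024):
    """Return True if the pickled model would not fit in a single document.

    headroom_bytes is an assumed allowance for the other fields stored
    alongside the model (names, metadata, and so on).
    """
    payload = pickle.dumps(model)
    return len(payload) + headroom_bytes > MAX_BSON_DOCUMENT_BYTES


if __name__ == "__main__":
    small_model = {"weights": [0.0] * 10}
    large_model = bytes(17 * 1024 * 1024)  # 17 MiB of zero bytes
    print(needs_gridfs(small_model))
    print(needs_gridfs(large_model))
```

A check like this only indicates that a different storage path is needed; it does not perform the GridFS write itself.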
4.7. Inferring / predicting
As of: 2.0.2.r7.1cf3eab
See All Options by Category for a full list of options.
- Files-only/ development example:
nemesyst