
Owens Information Transition

The Owens cluster will be decommissioned on February 3, 2025. Some pages may still reference Owens after it is decommissioned, and we are gradually updating the content. Thank you for your patience during this transition.

Caffe

Caffe is "a fast open framework for deep learning."

From their README:

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license.

Caffe also includes interfaces for both Python and Matlab, which have been built but have not been tested.

Availability and Restrictions

Versions

The following versions of Caffe are available on OSC clusters:

Version      Owens
1.0.0-rc3    X*

* Current default version

You can use module spider caffe to view available modules for a given machine. Feel free to contact OSC Help if you need other versions for your work.

The current version of Caffe on Owens requires cuda/8.0.44 for GPU calculations.

Access 

Caffe is available to all OSC users. If you have any questions, please contact OSC Help.

Publisher/Vendor/Repository and License Type

Berkeley AI Research, Open source

Usage

Usage on Owens

Setup on Owens

To configure the Owens cluster for the use of Caffe, use the following commands:

module load caffe
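
After loading the module, it can be worth confirming that the environment matches what the example batch script on this page expects. The following is a minimal sketch, assuming the module sets CAFFE_HOME (the example script relies on it) and that the examples and data directories sit directly under it. The function name check_caffe_env is hypothetical and is not provided by the caffe module.

```shell
#!/bin/bash
# check_caffe_env: hypothetical helper, not part of the OSC caffe module.
# Returns 0 if CAFFE_HOME is set and contains the examples/ and data/
# directories that the MNIST example copies; returns 1 otherwise.
check_caffe_env() {
    local dir
    if [ -z "${CAFFE_HOME:-}" ]; then
        echo "CAFFE_HOME is not set; run 'module load caffe' first" >&2
        return 1
    fi
    for dir in examples data; do
        if [ ! -d "$CAFFE_HOME/$dir" ]; then
            echo "missing directory: $CAFFE_HOME/$dir" >&2
            return 1
        fi
    done
    echo "Caffe environment looks usable: $CAFFE_HOME"
}
```

Calling check_caffe_env right after module load caffe gives an early, readable error instead of a failed cp later in a job.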

Batch Usage on Owens

Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations for Owens, and Scheduling Policies and Limits for more info. In particular, Caffe should be run on a GPU-enabled compute node.
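
Because Caffe should run on a GPU-enabled node, a job can check for a visible GPU before starting any training. The following is a minimal sketch, assuming the NVIDIA driver utility nvidia-smi is on the PATH of GPU nodes. The function name have_gpu is hypothetical.

```shell
#!/bin/bash
# have_gpu: hypothetical helper. Returns 0 if nvidia-smi exists and lists
# at least one device, 1 otherwise.
# Assumption: "nvidia-smi -L" prints one line per device that begins
# with "GPU ", e.g. "GPU 0: <model> (UUID: ...)".
have_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 || return 1
    nvidia-smi -L 2>/dev/null | grep -q '^GPU '
}

# Example use near the top of a batch script (commented out so the sketch
# has no side effects when sourced):
# if ! have_gpu; then
#     echo "No GPU visible on $(hostname); request a GPU node" >&2
#     exit 1
# fi
```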

An Example of Using Caffe with MNIST Training Data on Owens

Below is an example batch script (job.txt) for using Caffe. For a detailed explanation of the example, see http://caffe.berkeleyvision.org/gathered/examples/mnist.html

#!/bin/bash
#SBATCH --job-name=Caffe
#SBATCH --nodes=1 --ntasks-per-node=28 --gpus-per-node=1
#SBATCH --time=30:00
#SBATCH --account <project-account>

. /etc/profile.d/lmod.sh
# Load the modules for Caffe
ml caffe
# Migrate to job temp directory and copy folders over
cd $TMPDIR
cp -r $CAFFE_HOME/{examples,data} .
# Download, create, train
./data/mnist/get_mnist.sh
./examples/mnist/create_mnist.sh
./examples/mnist/train_lenet.sh
# Serialize log files
echo; echo 'LOG 1'
cat convert_mnist_data.bin.$(hostname)*
echo; echo 'LOG 2'
cat caffe.INFO
echo; echo 'LOG 3'
cat convert_mnist_data.bin.INFO
echo; echo 'LOG 4'
cat caffe.$(hostname).*
cp examples/mnist/lenet_iter_10000.* $SLURM_SUBMIT_DIR

In order to run it via the batch system, submit the job.txt file with the following command:

sbatch job.txt
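
On success, sbatch prints a line of the form "Submitted batch job <id>". The following is a minimal sketch of how that ID could be captured and used to follow the job. The helper name parse_job_id is hypothetical. The commented usage assumes Slurm's default output file name, slurm-<id>.out, written to the directory the job was submitted from.

```shell
#!/bin/bash
# parse_job_id: hypothetical helper that extracts the numeric job ID from
# sbatch's "Submitted batch job <id>" message. Prints nothing if the
# message does not match (for example, when sbatch reported an error).
parse_job_id() {
    printf '%s\n' "$1" | sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Usage on a login node (commented out so the sketch has no side effects):
# job_id=$(parse_job_id "$(sbatch job.txt)")
# squeue -j "$job_id"          # show the job's current state in the queue
# cat "slurm-${job_id}.out"    # job output, including the LOG 1-4 sections
```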
