Raghav K. Chhetri
Keller Lab, HHMI Janelia
03/26/2020
This tutorial is based on
Python: v3.7
OS: 64-bit Windows 10
Make minor modifications wherever necessary if using other platforms
Download Anaconda installer and proceed with installation
Once installation is complete, Python is ready to be launched. Two options:
From the Windows Start menu, search for and open “Anaconda Prompt”:
(base) C:\Users\Raghav>where python
(base) C:\Users\Raghav>python –-version
(base) C:\Users\Raghav>conda list
(base) C:\Users\Raghav>conda update conda
Several options to do so:
(base) C:\Users\Raghav>python
(base) C:\Users\Raghav>ipython
(base) C:\Users\Raghav>spyder
(base) C:\Users\Raghav>jupyter notebook
firstenv
here) (base) C:\Users\Raghav>conda create –-name firstenv python=3.7 anaconda
(base) C:\Users\Raghav>conda activate firstenv
(firstenv)C:\Users\Raghav>
(firstenv) C:\Users\Raghav>conda deactivate
(base) C:\Users\Raghav>
(base) C:\Users\Raghav>create --name secondenv –-clone firstenv
conda env remove --name firstenv
(base) C:\Users\Raghav>conda info --envs
Press Shift + Enter to run the code cell
x = [1, 2, 3, 4, 5]
y = x
print(x,y)
x.append(6)
print(x,y)
The value of y is also updated since both variables point to the same object in memory
x=['a','b','c']
print(x, y)
In the above cell, a new value was assigned to 'x', so it now points to another location in memory, whereas 'y' still points to the previous object in memory
Takeaway: variables in Python are pointers to objects in memory
Import packages
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
'Shift + Tab' will show you the Docstring for the object you just typed in a code cell. With 'Shift' pressed, press 'Tab' again to cycle through more documentation
x = np.linspace(-np.pi, np.pi, 201)
print(x.dtype,type(x),len(x))
plt.plot(x, np.sin(x))
plt.xlabel('Angle [rad]'); plt.ylabel('sin(x)'); plt.axis('tight')
plt.show()
inPath = 'Y:/Exchange/Raghav/tutorial_python/data/'
from os.path import join
from glob import glob
from skimage.io import imread, imsave
#Grab filenames for all cameras
fnames_cm00 = glob(join(inPath, '*CM00*.tif')); fnames_cm00.sort()
fnames_cm01 = glob(join(inPath, '*CM01*.tif')); fnames_cm01.sort()
fnames_cm02 = glob(join(inPath, '*CM02*.tif')); fnames_cm02.sort()
fnames_cm03 = glob(join(inPath, '*CM03*.tif')); fnames_cm03.sort()
print(fnames_cm00[0])
len(fnames_cm00)
TM=0
cm00stack = imread(fnames_cm00[TM])
cm01stack = imread(fnames_cm01[TM])
cm02stack = imread(fnames_cm02[TM])
cm03stack = imread(fnames_cm03[TM])
print(type(cm00stack), cm00stack.shape, cm00stack.dtype) #Dimensions of (Z,Y,X)
print(type(cm03stack), cm03stack.shape, cm03stack.dtype)
Define a function to show several images from two imagestacks:
def plot_image_pairs(stackA, stackB, cam_list, index_type, nrows=2, ncols=8, stride=8, vmin=90, vmax=300):
'''
This function shows several images from two imagestacks
Example: plot_image_pairs(cm00stack, cm01stack, ('CM00', 'CM01'), 'z')
'''
fig, ax = plt.subplots(nrows=nrows, ncols=ncols,figsize=(21,12))
for j in range(0,ncols):
ax[0,j].imshow(stackA[stride*(j+1),:,:], cmap='gray', vmin=vmin, vmax=vmax)
ax[0,j].axis('off')
ax[0,j].set_title( cam_list[0] +', ' +index_type +'=%d' % (stride*(j+1)))
ax[1,j].imshow(stackB[stride*(j+1),:,:], cmap='gray',vmin=vmin, vmax=vmax)
ax[1,j].axis('off')
ax[1,j].set_title(cam_list[1] + ', ' +index_type +'=%d' % (stride*(j+1)))
plot_image_pairs(cm00stack, cm01stack, ('CM00', 'CM01'), 'z')
plot_image_pairs(cm02stack, cm03stack, ('CM02', 'CM03'), 'z')
pyklb
is a Python wrapper to read/write klb files. On 64-bit Windows running Python 3.7, install as follows:
pyklb-master
to your local drive C:\Users\
firstenv
here) \dist\
folder in your local pyklb-master
cd pyklb-master\dist
pip
install the wheel present in this folder pip install pyklb-0.0.3.dev0-cp37-cp37m-win_amd64.whl
klb.dll
to your environment’s \lib\site-packages\
folder C:\Anaconda3\envs\firstenv\lib\site-packages\
If you are on a different OS or using a different Python version, build and install everything from scratch from here
Let’s now open a .klb file in Python to confirm that things are working as expected. Below, I’m using ipython
for this:
(base) C:\Users\Raghav>conda activate firstenv
(firstenv) C:\Users\Raghav>ipython
In [1]: import pyklb as klb
In [2]: img = klb.readfull('y:/.../...klb’)
In [3]: type(img)
In [4]: img.shape
In [5]: help(klb)
import pyklb as klb
img = klb.readfull('Y:/Documents/Slides/Raghav/tutorial_python/test.klb')
print(type(img), img.shape, img.dtype)
You'll notice that the file is loaded in Python as a numpy N-dimensional array. From its shape, we can tell that the numpy array represents a 3D image-stack. Recall that klb can support up to 5D arrays, and the dimension convetion is as follows: [tm,ch,z,y,x]
For this example 3D image-stack, the dimenion convention is thus [z,y,x]
Memory locations
tmp = img
print(hex(id(img)), hex(id(tmp)))
Note that both 'img' and 'tmp' are pointing to the same memory location.
However, if we were to invoke the 'copy' method, then a separate copy is created and mapped to a different memory location.
foo = np.copy(img)
print(hex(id(img)), hex(id(foo)))
img2 = ((img/np.max(img))*255).astype(np.uint8)
print(type(img2), img2.shape, img2.dtype)
print(hex(id(img)), hex(id(img2)))
Above, we took the original 'img' array and changed it to 8-bit and saved it as 'img2'
So, a new array was created in the process, and thus the memory locations of 'img' and 'img2' are different.
Save the 8-bit image as a new .klb file:
klb.writefull(img2, filepath='Y:/Exchange/Raghav/tutorial_python/img-8bit.klb')
Note: you can seek more info by calling the 'help' function as: help(klb)
Specify the sampling of your data in microns:
klb.writefull(img2, filepath='Y:/Exchange/Raghav/tutorial_python/img-8bit-um.klb',
pixelspacing_tczyx = (4.06125,0.8125,0.8125))
img3 = klb.readfull('Y:/Exchange/Raghav/tutorial_python/img-8bit-um.klb')
To display an image, we'll use matplotlib, a library for creating visualizations in Python.
import matplotlib.pyplot as plt
#Show the graphics in the notebook
%matplotlib inline
Let's grab a single plane from the 3D image-stack and display it in the notebook as follows
zslice = img3[24,:,:] # This is an example of slicing
Note that the index in Python starts at 0, so if you want to grab the first z-plane, here is how you would do it
first_zslice = img3[0,:,:]
And, the last z-plane can be grabbed using an index of '-1'
last_zslice = img3[-1,:,:]
plt.figure(figsize=(8,8))
plt.imshow(zslice,cmap='gray',interpolation='none')
plt.colorbar()
plt.show()
Select a ROI of the image by slicing the numpy array
roi = zslice[180:230,0:50]
print(type(roi),roi.mean(), roi.min(), roi.max())
plt.imshow(roi,cmap='gray',interpolation='none')
plt.show()
Instead of looking at a single image, let's display several images from our 3D image-stack on a subplot. For this, we'll write a function and we'll also use a built-in function called 'range' 'which generates a sequence of numbers as shown here
a = range(0,5)
print(type(a))
list(a)
Define a function to show several images from an imagestack:
def plot_images(stackA, axial_label, stride=8, vmin=60, vmax=255, cmap='hot'):
'''
This function shows several images from an imagestack
Example: plot_images(cm00stack, 'z')
'''
ncols = int(stackA.shape[0]/stride)
fig, ax = plt.subplots(1,ncols,figsize=(21,12))
for j in range(0,ncols):
ax[j].imshow(stackA[stride*(j),:,:], cmap=cmap, interpolation ='none', vmin=vmin, vmax=vmax)
ax[j].axis('off')
ax[j].set_title(axial_label +'=%d' % (stride*j))
plot_images(img3,'Zplane',stride=16)
Before we look at orthogonal views of our 3D image-stack, let's first generate a random 3x3 matrix and get familiar with numpy's 'transpose' method
a = np.random.randint(0,10,size=(3,3))
print(a)
a.transpose(), a.T
hex(id(a)), hex(id(a.transpose())), hex(id(a.T))
#Transpose an image-stack
imgY = img3.T
print('Original shape',img3.shape)
print('After transpose',imgY.shape)
The transposed 3D stack has the X and Z dimensions swapped. This is a 90 degree rotation about the Y-axis.
plot_images(imgY, axial_label='Y', stride =16)
Now swap the first two dimensions (Z and Y) to get a 90 degree rotation about X-axis
imgX=img3.transpose((1,0,2))
plot_images(imgX, axial_label='X', stride=64)
print('Original shape',img3.shape)
print('After transpose',imgX.shape)
Memory layout of the orignal 3D array and transposed 3D arrays that we generated:
print(img3.flags)
print(imgX.flags)
The arrays need to be 'C_CONTIGUOUS', otherwise you'll get an error when you try to save the array as klb. So, let's make them so before saving the rotated image-stacks
imgX = np.ascontiguousarray(imgX)
imgY = np.ascontiguousarray(imgY)
print(imgX.flags['C_CONTIGUOUS'])
print(imgY.flags['C_CONTIGUOUS'])
klb.writefull(imgX, filepath='Y:/Exchange/Raghav/tutorial_python/imgX-8bit-um.klb')
klb.writefull(imgY, filepath='Y:/Exchange/Raghav/tutorial_python/imgY-8bit-um.klb')
You'll soon miss the interactivity that we are accustomed to with ImageJ/Fiji and 'matplotlib' doesn't offer those features. An image-viewer for Python called 'napari' fills that void.
You can learn more on how to install 'napari' here: https://napari.org/
They also have several tutorials to get you started. It'd be benficial to familiarize yourself with another library in Python called Dask (dask array in particular). Dask is already installed in your Python environment, so you can start using it right away!
Dask arrays are like Numpy arrays, with the key difference being that they can be arbirtrily chunks within a grid. Go through '01_dask_array.ipynb' for more on this topic
Dask arrays coordinate many Numpy arrays, arranged into chunks within a grid. They support a large subset of the Numpy API.
from dask.distributed import Client, progress
client = Client(processes=False, threads_per_worker=8,
n_workers=8, memory_limit='12GB')
client
import dask.array as da
import time
x = da.random.random((10000, 10000), chunks=(1000, 1000))
x
y = x + x.T
z = y[::2, 5000:].mean(axis=1)
z
Call .compute()
when you want your result as a NumPy array.
z.compute()
If you have available RAM for your dataset then you can persist data in memory. This allows future computations to be much faster.
y = y.persist()
%%time
y[0, 0].compute()
%%time
y.sum().compute()
These functions do simple operations like add two numbers together, but they sleep for a random amount of time to simulate real work.
import random
def inc(x):
time.sleep(random.random())
return x + 1
def dec(x):
time.sleep(random.random())
return x - 1
def add(x, y):
time.sleep(random.random())
return x + y
We can run them like normal Python functions below
%%time
x = inc(1); y = dec(2); z = add(x, y)
z
These ran one after the other, in sequence. Note though that the first two lines inc(1) and dec(2) don't depend on each other, we could have called them in parallel had we been clever.
We can call dask.delayed
on our funtions to make them lazy. Rather than compute their results immediately, they record what we want to compute as a task into a graph that we'll run later on parallel hardware.
import dask
inc = dask.delayed(inc)
dec = dask.delayed(dec)
add = dask.delayed(add)
Calling these lazy functions is now almost free. We're just constructing a graph
xx = inc(1); yy = dec(2); zz = add(xx, yy)
zz
%%time
zz.compute()
https://github.com/WhoIsJack/python-bioimage-analysis-tutorial
Biopython is a library of computational molecular biology tools.
conda install -c conda-forge biopython
# Example Script:
# Import the sequence and alphabet toolboxes from the Biopython library
from Bio.Seq import Seq
from Bio.Alphabet import IUPAC
# Define a sequence as a DNA Seq object
seq = Seq("CCTCAGCGAGGACAGCAAGGGACTAGCC", IUPAC.unambiguous_dna)
# We've assigned a DNA alphabet to our sequence
alphabet = seq.alphabet
complement = seq.complement()
reverse_complement = seq.reverse_complement()
# Use the transcribe() method to switch alphabets to RNA
mRNA = seq.transcribe()
# Use the translate() method to get the corresponding protein sequence
protein = mRNA.translate()