{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# PBxplore API cookbook --- Visualize protein deformability"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Protein Blocks are great tools to study protein deformability. Indeed, if the block assigned to a residue changes between two frames of a trajectory, it represents a local deformation of the protein rather than the displacement of the residue.\n",
    "\n",
    "The PBxplore API allows to visualize Protein Block variability throughout a molecular dynamics simulation trajectory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from __future__ import print_function, division\n",
    "from pprint import pprint\n",
    "from IPython.display import Image, display\n",
    "import matplotlib.pyplot as plt\n",
    "import os\n",
    "\n",
    "# The following line, in a jupyter notebook, allows to display\n",
    "# the figure directly in the notebook. See <https://jupyter.org/>\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import pbxplore as pbx"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we will look at a molecular dynamics simulation of the barstar. As we will analyse Protein Block sequences, we first need to assign these sequences for each frame of the trajectory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Assign PB sequences for all frames of a trajectory\n",
    "trajectory = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.xtc')\n",
    "topology = os.path.join(pbx.DEMO_DATA_PATH, 'barstar_md_traj.gro')\n",
    "sequences = []\n",
    "for chain_name, chain in pbx.chains_from_trajectory(trajectory, topology):\n",
    "    dihedrals = chain.get_phi_psi_angles()\n",
    "    pb_seq = pbx.assign(dihedrals)\n",
    "    sequences.append(pb_seq)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Block occurences per position\n",
    "\n",
    "The basic information we need to analyse protein deformability is the count of occurences of each PB for each position throughout the trajectory. This occurence matrix can be calculated with the :func:`pbxplore.analysis.count_matrix` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "count_matrix = pbx.analysis.count_matrix(sequences)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``count_matrix`` is a numpy array with one row per PB and one column per position. In each cell is the number of time a position was assigned to a PB.\n",
    "\n",
    "We can visualize ``count_matrix`` using Matplotlib as any 2D numpy array."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<matplotlib.text.Text at 0x10c5c8ad0>"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/jon/Envs/pbxtest/lib/python2.7/site-packages/matplotlib/collections.py:590: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison\n",
      "  if self._edgecolors == str('face'):\n"
     ]
    },
    {
     "ename": "RuntimeError",
     "evalue": "Could not create write struct",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mRuntimeError\u001b[0m                              Traceback (most recent call last)",
      "\u001b[0;32m/Users/jon/Envs/pbxtest/lib/python2.7/site-packages/IPython/core/formatters.pyc\u001b[0m in \u001b[0;36m__call__\u001b[0;34m(self, obj)\u001b[0m\n\u001b[1;32m    335\u001b[0m                 \u001b[0;32mpass\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    336\u001b[0m             \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 337\u001b[0;31m                 \u001b[0;32mreturn\u001b[0m \u001b[0mprinter\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    338\u001b[0m             \u001b[0;31m# Finally look for special method names\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    339\u001b[0m             \u001b[0mmethod\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0m_safe_get_formatter_method\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprint_method\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/Users/jon/Envs/pbxtest/lib/python2.7/site-packages/IPython/core/pylabtools.pyc\u001b[0m in \u001b[0;36m<lambda>\u001b[0;34m(fig)\u001b[0m\n\u001b[1;32m    205\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    206\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0;34m'png'\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mformats\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 207\u001b[0;31m         \u001b[0mpng_formatter\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfor_type\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mFigure\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mfig\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mprint_figure\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfig\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m'png'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    208\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0;34m'retina'\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mformats\u001b[0m \u001b[0;32mor\u001b[0m \u001b[0;34m'png2x'\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mformats\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    209\u001b[0m         \u001b[0mpng_formatter\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfor_type\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mFigure\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;32mlambda\u001b[0m \u001b[0mfig\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mretina_figure\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfig\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkwargs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/Users/jon/Envs/pbxtest/lib/python2.7/site-packages/IPython/core/pylabtools.pyc\u001b[0m in \u001b[0;36mprint_figure\u001b[0;34m(fig, fmt, bbox_inches, **kwargs)\u001b[0m\n\u001b[1;32m    115\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    116\u001b[0m     \u001b[0mbytes_io\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mBytesIO\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 117\u001b[0;31m     \u001b[0mfig\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcanvas\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mprint_figure\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbytes_io\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;34m**\u001b[0m\u001b[0mkw\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m    118\u001b[0m     \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mbytes_io\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mgetvalue\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    119\u001b[0m     \u001b[0;32mif\u001b[0m \u001b[0mfmt\u001b[0m \u001b[0;34m==\u001b[0m \u001b[0;34m'svg'\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/Users/jon/Envs/pbxtest/lib/python2.7/site-packages/matplotlib/backend_bases.pyc\u001b[0m in \u001b[0;36mprint_figure\u001b[0;34m(self, filename, dpi, facecolor, edgecolor, orientation, format, **kwargs)\u001b[0m\n\u001b[1;32m   2156\u001b[0m                     \u001b[0morientation\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0morientation\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   2157\u001b[0m                     \u001b[0mdryrun\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mTrue\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 2158\u001b[0;31m                     **kwargs)\n\u001b[0m\u001b[1;32m   2159\u001b[0m                 \u001b[0mrenderer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfigure\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_cachedRenderer\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m   2160\u001b[0m                 \u001b[0mbbox_inches\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfigure\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_tightbbox\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrenderer\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;32m/Users/jon/Envs/pbxtest/lib/python2.7/site-packages/matplotlib/backends/backend_agg.pyc\u001b[0m in \u001b[0;36mprint_png\u001b[0;34m(self, filename_or_obj, *args, **kwargs)\u001b[0m\n\u001b[1;32m    531\u001b[0m             _png.write_png(renderer._renderer.buffer_rgba(),\n\u001b[1;32m    532\u001b[0m                            \u001b[0mrenderer\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mwidth\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mrenderer\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mheight\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 533\u001b[0;31m                            filename_or_obj, self.figure.dpi)\n\u001b[0m\u001b[1;32m    534\u001b[0m         \u001b[0;32mfinally\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m    535\u001b[0m             \u001b[0;32mif\u001b[0m \u001b[0mclose\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
      "\u001b[0;31mRuntimeError\u001b[0m: Could not create write struct"
     ]
    },
    {
     "data": {
      "text/plain": [
       "<matplotlib.figure.Figure at 0x10ba96050>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "im = plt.imshow(count_matrix, interpolation='none', aspect='auto')\n",
    "plt.colorbar(im)\n",
    "plt.xlabel('Position')\n",
    "plt.ylabel('Block')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "PBxplore provides the :func:`pbxplore.analysis.plot_map` function to ease the visualization of the occurence matrix."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pbx.analysis.plot_map('map.png', count_matrix)\n",
    "!rm map.png"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The :func:`pbxplore.analysis.plot_map` helper has a ``residue_min`` and a ``residue_max`` optional arguments to display only part of the matrix. These two arguments can be pass to all PBxplore functions that produce a figure."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pbx.analysis.plot_map('map.png', count_matrix,\n",
    "                      residue_min=60, residue_max=70)\n",
    "!rm map.png"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that matrix in the the figure produced by :func:`pbxplore.analysis.plot_map` is normalized so as the sum of each column is 1. The matrix can be normalized with the :func:`pbxplore.analysis.compute_freq_matrix`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "freq_matrix = pbx.analysis.compute_freq_matrix(count_matrix)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "im = plt.imshow(freq_matrix, interpolation='none', aspect='auto')\n",
    "plt.colorbar(im)\n",
    "plt.xlabel('Position')\n",
    "plt.ylabel('Block')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Protein Block entropy\n",
    "\n",
    "The $N_{eq}$ is a measure of variability based on the count matrix calculated above. It can be computed with the :func:`pbxplore.analysis.compute_neq` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "neq_by_position = pbx.analysis.compute_neq(count_matrix)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``neq_by_position`` is a 1D numpy array with the $N_{eq}$ for each residue."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plt.plot(neq_by_position)\n",
    "plt.xlabel('Position')\n",
    "plt.ylabel('$N_{eq}$')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The :func:`pbxplore.analysis.plot_neq` helper ease the plotting of the $N_{eq}$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pbx.analysis.plot_neq('neq.png', neq_by_position)\n",
    "!rm neq.png"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The ``residue_min`` and ``residue_max`` arguments are available."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pbx.analysis.plot_neq('neq.png', neq_by_position,\n",
    "                      residue_min=60, residue_max=70)\n",
    "!rm neq.png"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Display PB variability as a logo"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "pbx.analysis.generate_weblogo('logo.png', count_matrix)\n",
    "display(Image('logo.png'))\n",
    "!rm logo.png"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pbx.analysis.generate_weblogo('logo.png', count_matrix,\n",
    "                              residue_min=60, residue_max=70)\n",
    "display(Image('logo.png'))\n",
    "!rm logo.png"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
