Source code for teilab.pcr

# coding: utf-8
r"""
A **real-time polymerase chain reaction (real-time PCR)** is a laboratory technique of molecular biology based on the **polymerase chain reaction (PCR)**. It monitors the amplification of a targeted DNA molecule **during** the PCR (i.e., in real time), **not at its end**, as in conventional PCR. Real-time PCR can be used quantitatively (quantitative real-time PCR) and semi-quantitatively (i.e., above/below a certain amount of DNA molecules) (semi-quantitative real-time PCR).

############
Instructions
############

There are two basic quantification methods in real-time PCR, :ref:`"Absolute Quantification" <target to Absolute Quantification section>` and :ref:`"Relative Quantification" <target to Relative Quantification section>` , but the underlying ideas are the same. As the template DNA increases :math:`2 (1 + e (\text{amplification efficiency}))` times in one cycle of PCR, you can calculate the initial amount of DNA by detecting how many cycles of PCR were performed to reach a certain amount of DNA.

.. _target to Absolute Quantification section:

***********************
Absolute Quantification
***********************

The standard curve method for **absolute quantification** is similar to that for **relative quantification**, except the absolute quantities of the standards must first be known by some independent means.

Create several dilution series from a known concentration target template and create an **accurate** calibration curve that associates each target concentration with the :math:`Ct` value.

From this calibration curve, we can determine the actual copy number of the target DNAs using :math:`Ct` value.

.. _target to Relative Quantification section:

***********************
Relative Quantification
***********************

A method to focus on the **relative ratio** of a calibrator (control) sample and the gene of interest (GOI). The following various calibration algorithms have been proposed.

1. :ref:`ΔCt method <target to delta Ct method>`
2. :ref:`ΔΔCt method <target to delta delta Ct method>`
3. :ref:`Calibration curve method <target to Calibration curve method>`

.. _target to delta Ct method:

1. ΔCt method
==============

.. math::
    \text{Expression Ratio} = 2^{-\Delta Ct_{GOI}}\quad\left(GOI: Gene of Interest.\right)

.. math::
    \begin{cases}
        \Delta Ct_{GOI}&=Ct_{GOI}^{sample}-Ct_{GOI}^{cont.},\\
        ratio &= \left(\displaystyle\frac{[DNA]_0^{sample}}{[DNA]_0^{cont.}}\right)^{\displaystyle\frac{1}{\log_2(1+e)}}.
    \end{cases}

.. _target to delta delta Ct method:

2. ΔΔCt method
===============

.. math::
    \text{Expression Ratio} = 2^{-\Delta\Delta Ct_{GOI}}.

It is necessary that the amplification efficiencies of REF (Reference Gene) and GOI (Gene of Interest) are the same, but this assumption is generally not met.

.. math::
    \Delta\Delta Ct = \Delta Ct_{GOI} - \Delta Ct_{REF}.

.. _target to Calibration curve method:

3. Calibration curve method
===========================

.. math::
    \text{Expression Ratio} = \left(E_{GOI}^{-\Delta Ct_{GOI}}\right) / \left(E_{REF}^{-\Delta Ct_{REF}}\right)

A calibration curve is drawn to determine the amplification efficiency of **both REF and GOI**, and this is used to standardize the difference in :math:`Ct` of GOI contained in the target sample and the control sample with respect to REF. The amount of REF contained in the sample must be equal for all samples.

.. math::
    E =  10^{[-1/\text{slope}]}\left(=1+e\right)\quad(\text{efficiency from calibration curve.})

Let Amount (concentration) of initial template DNA be :math:`[DNA]_0`, amplification efficiency be :math:`e`, and Number of PCR cycles be :math:`C`.

The relationship between the amount of DNA (:math:`[DNA]`) in the PCR product amplified by PCR and the number of cycles (:math:`C`) is shown below:

.. math::
    [DNA] &= [DNA]_0\left(1 + e\right)^C\\ C &= \frac{\log[DNA]-\log[DNA]_0}{\log(1+e)}

Therefore, let the number of cycles :math:`C` that reaches the **threshold value** be :math:`Ct`, and the DNA concentration at that time is :math:`[DNA]_t`, the logarithm initial concentration is proportional to :math:`Ct` when the primers (:math:`\fallingdotseq e`) are the same, so this relationship can be represented by a calibration curve (**linear line**), and its slope (:math:`\text{slope}`) is shown below:

.. math::
    \text{slope}
    &= \frac{Ct^1-Ct^2}{\log[DNA]_0^1 - \log[DNA]_0^2}\\
    &= \frac{\left(\log[DNA]_0^2 - \log[DNA]_0^1\right) / \log(1+e)}{\log[DNA]_0^1 - \log[DNA]_0^2} \\
    &= \frac{-1}{\log\left(1+e\right)}\\
    \therefore e &= 10^{\frac{-1}{\text{slope}}} - 1

Here, let :math:`E = 10^{\frac{-1}{slope}} = 1+e`, it can be written as follows

.. math::
    E_{GOI}^{-\Delta Ct_{GOI}}
    &= \left(1 + e_{GOI}\right)^{\displaystyle\frac{\log[DNA]_{0,GOI}^{sample}-\log[DNA]_{0,GOI}^{cont.}}{\log\left(1+e_{GOI}\right)}}\\
    &= \frac{[DNA]_{0,GOI}^{sample}}{[DNA]_{0,GOI}^{cont.}}

so, the expression level of the calibrator (control) sample and the target sample can be compared in a **primer-independent manner**.

Therefore, by comparing this value with REF (ex. Housekeeping gene), it is possible to correctly compare the amount of DNA between different samples.

##############
Python Objects
##############
"""
from typing import Any, List, Optional, Tuple

import numpy as np
from matplotlib.axes import Axes
from nptyping import NDArray

from .plot.matplotlib import update_layout
from .utils.math_utils import optimize_linear
from .utils.plot_utils import get_colorList, subplots_create


[docs]def calibration_curve_plot( qualities: NDArray[Any, float], cts: NDArray[(Any, Any, Any), float], target: str = "", color: str = "#c94663", ecolor: str = "black", ax: Optional[Axes] = None, **kwargs, ) -> Tuple[Axes, float]: """Plot a calibration curve. Args: qualities (NDArray[Any,float]) : Concentration of dilution series. shape=(n_qualities) cts (NDArray[(Any,Any),float]) : Ct values for calibration. shape=(n_qualities, n_trials_calibrations) target (str, optional) : The calibration target. Defaults to ``"Calibration curve"``. color (str, optional) : The color of plot. Defaults to ``"#c94663"``. ecolor (str, optional) : The color of Error Bar. Defaults to ``"#c94663"``. ax (Optional[Axes], optional) : An instance of ``Axes``. Defaults to ``None``. Returns: Tuple[Axes,float]: An instance of ``Axes`` with calibration curve, and amplification efficiency. Raises: ValueError: ``cts.shape[0]`` is not the same as ``len(qualities)``. .. plot:: :include-source: :class: popup-img >>> import numpy as np >>> from teilab.pcr import calibration_curve_plot >>> from teilab.utils import subplots_create >>> qualities = np.asarray([2, 1, 0.1]) >>> calibrations = np.asarray([ ... [[21.9904747, 21.94359016], [23.2647419, 23.24508476], [27.37600517, 27.46136856]], ... [[26.68031502, 26.75434494], [28.25722122, 28.239748 ], [31.77442169, 32.42930222]] >>> ]) >>> targets = ["Gapdh", "VIM"] >>> fig, axes = subplots_create(ncols=2, style="matplotlib", figsize=(12,4)) >>> for ax,cts,target in zip(axes, calibrations, targets): ... _ = calibration_curve_plot(qualities, cts, ax=ax, target=target) >>> fig.show() """ n_qualities, n_trials_calibrations = cts.shape if len(qualities) != n_qualities: raise TypeError( f"cts.shape[0] must be the same as the length of qualities, but got ({cts.sape[0]}!={len(qualities)})" ) log_qualities = np.log10(qualities) cts_mean = np.mean(cts.reshape(n_qualities, -1), axis=1) cts_std = np.std(cts.reshape(n_qualities, -1), axis=1) slope, intercept, func = optimize_linear(X=log_qualities, Y=cts_mean) e = 10 ** (-1 / slope) - 1 ax = ax or subplots_create(ncols=1, nrows=1, style="matplotlib")[1] X = np.linspace(np.min(log_qualities), np.max(log_qualities), 1000) ax.plot(X, func(X), color=color) ax.errorbar(x=log_qualities, y=cts_mean, yerr=cts_std, capsize=5, fmt="o", ecolor=ecolor, color=color) ax = update_layout( ax=ax, title=f"Calibration curve for '{target}' (e={e:.3f})", xlabel="Relative initial concentration", ylabel="Cts", **kwargs, ) ax.set_xticks(log_qualities) ax.set_xticklabels(qualities) return (ax, e)
[docs]def expression_ratio_plot( GOI_Cts: NDArray[(Any, Any), float], REF_Cts: NDArray[(Any, Any), float], e_GOI: float, e_REF: float, labels: List[str], name_GOI: str = "", name_REF: str = "", color: str = "#c94663", ax: Optional[Axes] = None, **kwargs, ) -> Axes: r"""Plot to compare Expression Ratios. Args: GOI_Cts, REF_Cts (NDArray[(Any,Any),float]) : Threshold Cycles of GOI or REF. shape=(n_samples, n_trials_Cts) e_GOI, e_REF (float) : Efficiency of GOI or REF primer. labels (List[str]) : [description] name_GOI, name_REF (str, optional) : The name of GOI or REF. Defaults to ``""``. color (str, optional) : The color of plot. Defaults to ``"#c94663"``. ax (Optional[Axes], optional) : An instance of ``Axes``. Defaults to ``None``. Returns: Axes: An instance of ``Axes`` with expression ratio plots. .. plot:: :include-source: :class: popup-img >>> import numpy as np >>> import matplotlib.pyplot as plt >>> from matplotlib.gridspec import GridSpec >>> from teilab.pcr import expression_ratio_plot, calibration_curve_plot >>> fig = plt.figure(constrained_layout=True, figsize=(14,6)) >>> gs = GridSpec(nrows=2, ncols=4, figure=fig) >>> targets = ["Gapdh", "VIM"] >>> qualities = np.asarray([2, 1, 0.1]) >>> calibration_cts = np.asarray([ ... [[21.9904747, 21.94359016], [23.2647419, 23.24508476], [27.37600517, 27.46136856]], ... [[26.68031502, 26.75434494], [28.25722122, 28.239748 ], [31.77442169, 32.42930222]] >>> ]) >>> n_targets,n_qualities,n_trials_calibrations = calibration_cts.shape >>> print( ... "[Calibration]", ... f"{n_targets}_targets: {targets}", ... f"{n_qualities}_targets: {qualities}", ... f"n_trials : {n_trials_calibrations}", ... sep="\n" >>> ) >>> efficiencies = [] >>> for i,cts in enumerate(calibration_cts): ... ax, e = calibration_curve_plot(qualities, cts, ax=fig.add_subplot(gs[i,3]), target=targets[i]) ... efficiencies.append(e) >>> samples = ["mock(1)", "mock(5)", "unmodified", "2OMe3", "2OMe5", "2OMe7", "LNA3", "LNA7"] >>> Cts = np.asarray([ ... [[23.2647419, 23.24508476], [20.76102257, 20.77914238], [19.40455055, 19.52949905], [19.70094872, 19.60042572], [19.41954041, 19.13051605], [24.17935753, 21.98130798], [20.01245308, 20.02809715], [21.2081356, 20.1692791]], ... [[28.25722122, 28.239748], [25.16436958, 25.28390503], [24.71133995, 24.70510483], [25.37249184, 25.47054863], [24.72605515, 24.43961525], [27.91354942, 27.93320656], [26.08522797, 26.0483017], [24.96000481, 25.04871941]] >>> ]) >>> _,n_samples,n_trials_Cts = Cts.shape >>> print( ... "[Normalized Expression Ratio]", ... f"{n_samples}_samples: {samples}", ... f"n_trials : {n_trials_Cts}", ... sep="\n" >>> ) >>> ax = expression_ratio_plot(GOI_Cts=Cts[1], REF_Cts=Cts[0], e_GOI=efficiencies[1], e_REF=efficiencies[0], name_GOI="VIM", name_REF="Gapdh", labels=samples, ax=fig.add_subplot(gs[:,:3])) >>> fig.show() """ n_samples, n_trials_Cts = GOI_Cts.shape if len(labels) != n_samples: labels = [f"No.{i}" for i in range(n_samples)] GOI_mean = np.mean(GOI_Cts.reshape(n_samples, -1), axis=1) REF_mean = np.mean(REF_Cts.reshape(n_samples, -1), axis=1) ratios = ((e_GOI + 1) ** (-GOI_mean)) / ((e_REF + 1) ** (-REF_mean)) ax = ( ax or subplots_create(ncols=1, nrows=1, style="matplotlib", figsize=(int(n_samples * 1.6), int(n_samples * 0.8)))[ 1 ] ) ax.bar(labels, ratios, color=color) update_layout( ax=ax, title=f"'{name_GOI}' Expression (Normalized by '{name_REF}')", xlabel="samples", ylabel="Relative Expression", **kwargs, ) ax.grid() return ax