Promiscuous ligand-receptor binding¶
In this case study, we demonstrate how to use EQTK and be used to efficiently investigate promiscuous ligand-receptor binding. We use the term “promiscuous” because more than one ligand may bind a given receptor. Antebi, et al. (Cell, 2017) studied this system in their paper on combinatorial signaling in the bone morphogenetic protein signaling pathway.
We will specifically study what Antebi and coworkers refer to as the (2,2,2) system, in which each of two ligands may sequentially bind to one of two type A receptors and one of two type B receptors.
To start the analysis, we import Numpy and EQTK, and additionally HoloViews for plotting, also setting some nice defaults for the plots. We will also import Numba, which we will use to just-in-time compile our code.
[1]:
import numpy as np
import pandas as pd
import tqdm
import eqtk
import holoviews as hv
hv.extension('bokeh')
WARNING:param.main: pandas could not register all extension types imports failed with the following error: cannot import name 'ABCIndexClass' from 'pandas.core.dtypes.generic' (/Users/bois/opt/anaconda3/lib/python3.8/site-packages/pandas/core/dtypes/generic.py)
The chemical reactions¶
For the (2, 2, 2) system, we consider two ligands, L₁ and L₂, two receptors of type A, A₁ and A₂, and two receptors of type B, B₁ and B₂. We could choose one set of chemical reactions,
\begin{align} &\mathrm{D}_{ij} \rightleftharpoons \mathrm{A}_i + \mathrm{L}_j \\[1em] &\mathrm{T}_{ijk} \rightleftharpoons \mathrm{D}_{ij} + \mathrm{B}_k, \end{align}
for all combinations of \((i, j, k) \in (1, 2)\). Here, \(\mathrm{D}_{ij}\) denotes a dimer \(\mathrm{A}_i\cdot\mathrm{L}_j\) and \(\mathrm{T}_{ijk}\) denotes a trimer \(\mathrm{A}_i\cdot\mathrm{L}_j\cdot\mathrm{B}_k\). Note that we have stipulated than the ligand can only bind a type B receptor if it is already bound to a type A receptor, as did Antebi, et al. The concentration of the trimeric complex, \(\mathrm{T}_{ijk}\) determines the strength of the signaling due to the ligand-receptor binding. The signal strength \(S\) is given by
\begin{align} S = \sum_{ijk} \epsilon_{ijk}\,[\mathrm{T}_{ijk}], \end{align}
where \([\mathrm{T}_{ijk}]\) is the concentration of \(\mathrm{T}_{ijk}\).
To mimic the situation in cell culture, in which cells sit at the bottom of a sea of solution containing ligands, we hold the concentration of ligands constant. We will therefore solve for the equilibrium concentrations using eqtk.fixed_value_solve()
.
To compute all of the possible the chemical reactions, it is convenient to write a function to generate the reactions as strings.
[2]:
def make_rxns(nA, nB, nL):
rxns = ""
# Dimer rxns
for j in range(nL):
for i in range(nA):
rxns += f"A_{i+1} + L_{j+1} <=> D_{i+1}_{j+1}\n"
# Trimer rxns
for j in range(nL):
for i in range(nA):
for k in range(nB):
rxns += f"D_{i+1}_{j+1} + B_{k+1} <=> T_{i+1}_{j+1}_{k+1}\n"
return rxns
Using this function for the (2, 2, 2) system gives the following result.
[3]:
nA = nB = nL = 2
print(make_rxns(nA, nB, nL))
A_1 + L_1 <=> D_1_1
A_2 + L_1 <=> D_2_1
A_1 + L_2 <=> D_1_2
A_2 + L_2 <=> D_2_2
D_1_1 + B_1 <=> T_1_1_1
D_1_1 + B_2 <=> T_1_1_2
D_2_1 + B_1 <=> T_2_1_1
D_2_1 + B_2 <=> T_2_1_2
D_1_2 + B_1 <=> T_1_2_1
D_1_2 + B_2 <=> T_1_2_2
D_2_2 + B_1 <=> T_2_2_1
D_2_2 + B_2 <=> T_2_2_2
These reactions are in the appropriate format for EQTK’s reaction parser to generate a stoichiometric matrix \(\mathsf{N}\) from the reactions. The eqtk.parse_rxns()
function places the stoichiometric matrix in a data frame with the ordering of the columns given by the order in which the chemical species appear in the string containing the reactions. Because we will be doing thousands and thousands of solves as we explore parameter space, we will be using \(\mathsf{N}\) as a Numpy
arrays to save on the computational cost of creating and manipulating data frames in eqtk.solve()
’s I/O. So, we should make \(\mathsf{N}\) as a Numpy array where the first columns represent receptors of type A, the next columns represent receptors of type B, the next ligands, then dimers, and finally trimers. We can code this up in a function to make our stoichiometric matrix.
[4]:
def make_N(nA, nB, nL):
rxns = make_rxns(nA, nB, nL)
N = eqtk.parse_rxns(rxns)
# Sorted names
names = sorted(N.columns, key=lambda s: (len(s), s))
# Sorted columns
N = N[names]
# As a Numpy array
return N.to_numpy(copy=True, dtype=float)
We can now build our stoichiometric matrix.
[5]:
N = make_N(nA, nB, nL)
# Take a look
print(make_N(2, 2, 2))
[[-1. 0. 0. 0. -1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. -1. 0. 0. -1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[-1. 0. 0. 0. 0. -1. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. -1. 0. 0. 0. -1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. -1. 0. 0. 0. -1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. -1. 0. 0. -1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[ 0. 0. -1. 0. 0. 0. 0. 0. -1. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[ 0. 0. 0. -1. 0. 0. 0. 0. -1. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[ 0. 0. -1. 0. 0. 0. 0. -1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. -1. 0. 0. 0. -1. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[ 0. 0. -1. 0. 0. 0. 0. 0. 0. -1. 0. 0. 0. 0. 0. 0. 1. 0.]
[ 0. 0. 0. -1. 0. 0. 0. 0. 0. -1. 0. 0. 0. 0. 0. 0. 0. 1.]]
The equilibrium coefficients¶
We need to specify the equilibrium constants for dimer formation and for trimer formation. We will perform an exploration of parameter space similar to that of Antebi, et al. They chose dimensionless units such that the following hold:
Defining \(K_{ij}\) as the equilibrium constant for the formation of \(\mathrm{D}_{ij}\), \(\sum_{i}K_{ij} = 1\) for each ligand \(j\).
Defining \(K_{ijk}\) as the equilibrium constant for the formation of \(\mathrm{T}_{ijk}\), \(\sum_{ijk}K_{ijk} = 1\).
The equilibrium constants were then chosen out uniform distributions such that the above sum-to-one constraints are enforced. This is accomplished by choosing the equilibrium constants out of a Dirichlet distribution with all parameters set to one.
[6]:
def make_K(nA, nB, nL):
# Dimerization equilibrium constants
K_dimer = np.concatenate([np.random.dirichlet(np.ones(nA)) for i in range(nL)])
# Trimerization equilibrium constants
K_trimer = np.random.dirichlet(np.ones(nA*nB*nL))
return np.concatenate((K_dimer, K_trimer))
Let’s compute a set of equilibrium constants so we can see how the array looks.
[7]:
# Seed for reproducibility
np.random.seed(239486234)
K = make_K(nA, nB, nL)
K
[7]:
array([0.6957962 , 0.3042038 , 0.43988853, 0.56011147, 0.01418418,
0.18754634, 0.11632898, 0.21491361, 0.08621283, 0.21522578,
0.14238907, 0.02319921])
In this case, the first four entries are for dimer formation and the last eight are for trimer formation.
The initial concentrations¶
In exploring parameter space for the (2, 2, 2) system, Antebi, et al. varied the concentration of each of the two ligands over six orders of magnitude, from \(10^{-3}\) to \(10^3\) in dimensionless units. As they did, we will use 15 concentrations of each ligand over that interval.
For each calculation, Antebi and coworkers chose random concentrations for each receptor from the log-uniform distribution on \([10^{-3}, 10^3]\).
We can write a function to make a set of initial concentrations satisfying these requirements. We need to hold these ligand concentrations constant for the calculation, so we also need to generate a fixed_c
array, which has nonnegative concentrations for the ligand monomers and NaN’s for all other species.
[8]:
def make_c0_grid(nA, nB, nL, n):
# Ligand concentrations
cL0 = np.logspace(-3, 3, n)
cL0 = np.meshgrid(*tuple([cL0]*nL))
# Initialize c0 and fixed_c
c0 = np.zeros((n**nL, nA + nB + nL + nA*nL + nA*nB*nL))
fixed_c = np.zeros_like(c0) * np.nan
# Add ligand concentrations
for i in range(nL):
c0[:, i+nA+nB] = cL0[i].flatten()
fixed_c[:, i+nA+nB] = cL0[i].flatten()
# Random concentrations of receptors
for i in range(nA):
c0[:, i] = 10**np.random.uniform(-3, 3)
for i in range(nB):
c0[:, i + nA] = 10**np.random.uniform(-3, 3)
return c0, fixed_c
Let’s generate c0
and fixed_c
arrays and take a look at their shape.
[9]:
n = 15
c0, fixed_c = make_c0_grid(nA, nB, nL, n)
c0.shape
[9]:
(225, 18)
There are 225 different sets of initial concentrations we solve for.
The readout¶
The concentrations of the respective species are “read out” by the intensity of the intracellular signaling triggered by the ligand-receptor binding. The signal strength \(S\) is given by
\begin{align} S = \sum_{ijk} \epsilon_{ijk}\,[\mathrm{T}_{ijk}]. \end{align}
We can write a function to compute this from the parameters \(\epsilon_{ijk}\) and concentrations returned by eqtk.solve()
.
[10]:
def readout(epsilon, c):
return np.dot(epsilon, c[:, -len(epsilon):].transpose())
The choice of \(\epsilon_{ijk}\) is drawn out of a uniform distribution subject to the contraint that \(\sum_{ijk} \epsilon_{ijk} = 1\). We can again accomplish this by drawing out of a Dirichlet distribution.
[11]:
def make_epsilon(nA, nB, nL):
return np.random.dirichlet(np.ones(nA*nB*nL))
epsilon = make_epsilon(nA, nB, nL)
Solve!¶
We now have all the ingredients to solve for the concentrations and compute the readout.
[12]:
c = eqtk.fixed_value_solve(c0=c0, fixed_c=fixed_c, N=N, K=K)
s = readout(epsilon, c)
We can make a heat map of the readout as a function of ligand concentration, bearing in mind that the initial ligand concentrations are in columns 4 and 5 of c0
.
[13]:
hv.HeatMap(
data=(c0[:,4], c0[:,5], s),
kdims=['L1', 'L2'],
vdims=['S']
).opts(
logx=True,
logy=True,
logz=True,
cmap='viridis',
frame_height=200,
frame_width=200,
toolbar='above',
xrotation=45,
colorbar=True,
)
[13]:
Rapid characterization of signaling behavior¶
Antebi et al. defined two summary parameters that can be computed from the above heat map. They considered only the regions of high ligand concentrations; that is the top row and rightmost column of the above heat map. They defined the following quantities:
variable |
description |
---|---|
\(a\) |
Signaling level of weaker ligand in absence of stronger ligand |
\(b\) |
Signaling level of stronger ligand in absence of weaker ligand |
\(c\) |
Maximum signal in the high-ligand concentration region of the heat map |
\(d\) |
Minimum signal in the high-ligand concentration region of the heat map |
In the heat map above, L1 is the weaker ligand because at high ligand concentration, it has lower signaling.
From these parameters, they defined the relative ligand strength, \(\mathrm{RLS} = a/b\), which ranges from zero to one. They also defined the ligand interference coefficient, \(\mathrm{LIC} = d/a - b/c\).
We will randomly select equilibrium constants, receptor concentrations, and readout magnitudes \(\epsilon_{ijk}\) and compute the RLS and LIC. We can then make a plot of LIC vs. RLS.
To do this, we only need to compute equilibria in the high ligand concentration regime. So, let’s write another function that will generate c0
only for that regime.
[14]:
def make_c0_high_ligand(nA, nB, nL, n):
if nL != 2:
raise ValueError("Only defined for the two-ligand problem.")
# Initialize c0 and fixed_c
c0 = np.zeros((2 * n, nA + nB + nL + nA * nL + nA * nB * nL))
fixed_c = np.zeros_like(c0) * np.nan
# Ligand concentrations
cL10 = np.concatenate([[0], np.logspace(-3, 3, n), [1e3] * (n - 1)])
cL20 = np.concatenate([[1e3] * (n - 1), np.logspace(3, -3, n), [0]])
c0[:, nA + nB] = cL10
c0[:, nA + nB + 1] = cL20
fixed_c[:, nA + nB] = cL10
fixed_c[:, nA + nB + 1] = cL20
# Cannot fix a concentration to zero, so change fixed_c=0 values
fixed_c[np.where(fixed_c == 0)] = np.nan
# Random concentrations of receptors
for i in range(nA):
c0[:, i] = 10 ** np.random.uniform(-3, 3)
for i in range(nB):
c0[:, i + nA] = 10 ** np.random.uniform(-3, 3)
return c0, fixed_c
Let’s look at how the ligand concentrations are represented in c0
.
[15]:
c0, fixed_c = make_c0_high_ligand(nA, nB, nL, n)
c0[:,4:6]
[15]:
array([[0.00000000e+00, 1.00000000e+03],
[1.00000000e-03, 1.00000000e+03],
[2.68269580e-03, 1.00000000e+03],
[7.19685673e-03, 1.00000000e+03],
[1.93069773e-02, 1.00000000e+03],
[5.17947468e-02, 1.00000000e+03],
[1.38949549e-01, 1.00000000e+03],
[3.72759372e-01, 1.00000000e+03],
[1.00000000e+00, 1.00000000e+03],
[2.68269580e+00, 1.00000000e+03],
[7.19685673e+00, 1.00000000e+03],
[1.93069773e+01, 1.00000000e+03],
[5.17947468e+01, 1.00000000e+03],
[1.38949549e+02, 1.00000000e+03],
[3.72759372e+02, 1.00000000e+03],
[1.00000000e+03, 3.72759372e+02],
[1.00000000e+03, 1.38949549e+02],
[1.00000000e+03, 5.17947468e+01],
[1.00000000e+03, 1.93069773e+01],
[1.00000000e+03, 7.19685673e+00],
[1.00000000e+03, 2.68269580e+00],
[1.00000000e+03, 1.00000000e+00],
[1.00000000e+03, 3.72759372e-01],
[1.00000000e+03, 1.38949549e-01],
[1.00000000e+03, 5.17947468e-02],
[1.00000000e+03, 1.93069773e-02],
[1.00000000e+03, 7.19685673e-03],
[1.00000000e+03, 2.68269580e-03],
[1.00000000e+03, 1.00000000e-03],
[1.00000000e+03, 0.00000000e+00]])
The concentration of L1 varies from 0 to 1000 while L2 is held fixed at 1000. Then, L2 varies from 1000 to 0 while L1 is held fixed at 1000.
Finally, we need a function to compute the LIC and RLS. Knowing the structure of the c0
arrays helps in this task.
[16]:
def lic_rls(s, n):
a = s[0]
b = s[-1]
c = np.max(s)
d = np.min(s)
# Ensure a is the low level.
if a > b:
a, b = b, a
lic = d / a - b / c
rls = a / b
return lic, rls
We are now ready to solve for the LIC and RLS for many random parameter sets. Running the cell below will take a couple of minutes.
[17]:
n_sets = 10000
rls = np.empty(n_sets)
lic = np.empty(n_sets)
# List to store parameters
parameters = [None for _ in range(n_sets)]
for i in tqdm.tqdm(range(n_sets)):
c0, fixed_c = make_c0_high_ligand(nA, nB, nL, n)
K = make_K(nA, nB, nL)
epsilon = make_epsilon(nA, nB, nL)
parameters[i] = dict(receptor_conc=c0[0,:4], K=K, epsilon=epsilon)
c = eqtk.fixed_value_solve(c0=c0, fixed_c=fixed_c, N=N, K=K)
s = readout(epsilon, c)
lic[i], rls[i] = lic_rls(s, n)
100%|██████████| 10000/10000 [02:01<00:00, 82.41it/s]
Let’s plot the results!
[18]:
hv.Points(
data=(lic, rls, np.arange(n_sets)),
kdims=["ligand interference coefficient", "relative ligand strength"],
vdims=["index"]
).opts(
color="black",
fill_alpha=0,
height=250,
line_alpha=0.3,
show_grid=True,
size=3,
toolbar="above",
tools=["hover"],
xlim=(-1, 1),
ylim=(0, 1),
width=450,
)
[18]:
Since we stored the parameters in a list of dictionaries, we can easily hover over a point on the RLS vs. LIC plot and make a full grid plot with those parameters. For example, index 1693 (upper right-most point) might be interesting.
[19]:
i = 1693
K = parameters[i]["K"]
c0, fixed_c = make_c0_grid(nA, nB, nL, 15)
c0[:,:4] = parameters[i]["receptor_conc"]
epsilon = parameters[i]["epsilon"]
c = eqtk.fixed_value_solve(c0=c0, fixed_c=fixed_c, N=N, K=K)
s = readout(epsilon, c)
hv.HeatMap(
data=(c0[:,4], c0[:,5], s),
kdims=['L1', 'L2'],
vdims=['S']
).opts(
logx=True,
logy=True,
logz=True,
cmap='viridis',
frame_height=200,
frame_width=200,
toolbar='above',
xrotation=45,
colorbar=True,
)
[19]:
This is a “balance” condition, where high signaling occurs if L1 and L2 are sufficiently high and of similar magnitude.