2022-08-21 • Area under STA (new conntest)

Imports

#
using Revise
using MyToolbox
using VoltoMapSim
[ Info: Precompiling VoltoMapSim [f713100b-c48c-421a-b480-5fcb4c589a9e]

Params

Based on Roxin; same as previous nb’s.

d = 6
p = get_params(
    duration = 10minutes,
    p_conn = 0.04,
    g_EE = 1   / d,
    g_EI = 18  / d,
    g_IE = 36  / d,
    g_II = 31  / d,
    ext_current = Normal(-0.5 * pA/√seconds, 5 * pA/√seconds),
    E_inh = -80 * mV,
    record_v = [1, 801],
);

Run sim

s = cached(sim, [p.sim]);
s = augment_simdata(s, p);

New connection test statistic

ii = s.input_info[1];
using PyPlot
[ Info: Precompiling Sciplotlib [61be95e5-9550-4d5f-a203-92a5acbc3116]
using VoltoMapSim.Plot
plotSTA(ii.v, ii.spiketrains.conn.inh[1], p);
../_images/2022-08-21__Area-under-STA_16_0.png
sta = calc_STA(ii.v, ii.spiketrains.conn.inh[1], p);
t = sum(sta .- sta[1]);

What are the units? It's a sum of voltage × dt terms, so the units are volt·seconds. Strictly I'd have to multiply every term by dt, but since dt is constant, I can factor it out and apply it afterwards.
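This factoring-out of the constant dt can be sanity-checked with toy numbers:

```julia
v  = [1.0, 2.0, 3.0]   # toy voltage samples
dt = 0.1               # toy timestep

# Multiplying every term by dt, or the whole sum afterwards, is the same:
sum(v .* dt) ≈ sum(v) * dt   # holds (up to floating-point rounding)
```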

dt = p.sim.general.Δt;
t * dt / (mV * ms)
-3.55

(Sanity check from the graph for how big this should be: roughly a 0.5 mV difference over some 40 ms, so on the order of 20 mV·ms. Plausible.) How much of it is above, and how much below, the starting value?

sig = (sta .- sta[1]) * dt / (mV*ms)
above = sum(sig[sig .> 0])
below = sum(sig[sig .< 0])
above, below
(9, -12.5)

Nice, so yes units seem right.

How to name this measure. It’s area under STA, referenced to STA at t=0. AUS? AUA? relsum?
Area over start.
Just area is quite nice.

Adding this to misc.jl:

area(STA) = sum(STA .- STA[1]);
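The sign convention is what makes this measure useful for typing inputs: positive when the STA sits mostly above its starting value (excitatory-like), negative when mostly below (inhibitory-like). A toy check, with hypothetical STA values:

```julia
area(STA) = sum(STA .- STA[1])

sta_exc = [0.0, 0.2, 0.5, 0.3, 0.1]      # upward bump
sta_inh = [0.0, -0.2, -0.5, -0.3, -0.1]  # downward bump

area(sta_exc)  # > 0
area(sta_inh)  # < 0
```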

Compare performance

..between test measures: existing (peak-to-peak) and the new.

p0 = p;
pn = (@set p.conntest.STA_test_statistic = "area");
cached_conntest_perf = VoltoMapSim.cached_conntest_perf;
perf0 = cached_conntest_perf(1, ii.v, ii.spiketrains, p0);
perf0.detection_rates
(TPR_exc = 0.154, TPR_inh = 1, FPR = 0.15)
perfn = cached_conntest_perf(1, ii.v, ii.spiketrains, pn);
Testing connections: 100%|██████████████████████████████| Time: 0:00:24
Saving output at `C:\Users\tfiers\.phdcache\datamodel v2 (net)\evaluate_conntest_perf\51a268c55a44be7a.jld2` … done (4.5 s)
perfn.detection_rates
(TPR_exc = 0, TPR_inh = 0.8, FPR = 0.175)

Aha! It seems significantly worse than peak-to-peak (for this sim, this neuron).

What's left to do is to use this to determine whether a presynaptic neuron is excitatory or inhibitory (right now we're cheating and presupposing that knowledge in the conntest perf eval).

I think it’s gonna make the performance way worse.

First, let’s redo the test for the inhibitory recorded neuron.

ii_inh = s.input_info[801];
perf0 = cached_conntest_perf(801, ii_inh.v, ii_inh.spiketrains, p0);
Testing connections: 100%|██████████████████████████████| Time: 0:00:20
Saving output at `C:\Users\tfiers\.phdcache\datamodel v2 (net)\evaluate_conntest_perf\6a503ee8521e347c.jld2` … done
perf0.detection_rates
(TPR_exc = 0.714, TPR_inh = 0.9, FPR = 0.05)
perfn = cached_conntest_perf(801, ii_inh.v, ii_inh.spiketrains, pn);
Testing connections: 100%|██████████████████████████████| Time: 0:00:21
Saving output at `C:\Users\tfiers\.phdcache\datamodel v2 (net)\evaluate_conntest_perf\352d0f0ba159fa0b.jld2` … done
perfn.detection_rates
(TPR_exc = 0.381, TPR_inh = 0.7, FPR = 0.1)

Again, significantly worse performance.

Use area for deciding exc or inh

This was the original motivation for this measure.

Our current test_connection returns a p-value based on peak-to-peak. It only says whether it thinks the presynaptic neuron is connected, not what type it is. We’ll add the area for that:

function test_connection_and_type(v, spikes, p)
    pval = test_connection(v, spikes, p)
    dt = p.sim.general.Δt
    A = area(calc_STA(v, spikes, p)) * dt / (mV*ms)
    if pval ≥ p.evaluation.α
        predicted_type = :unconn
    elseif A > 0
        predicted_type = :exc
    else
        predicted_type = :inh
    end
    return (; predicted_type, pval, area_over_start=A)
end;
test_connection_and_type(ii.v, ii.spiketrains.conn.inh[1], p)
(predicted_type = :inh, pval = 0.01, area_over_start = -3.55)

Performance optimization sidebar

@time test_connection_and_type(ii.v, ii.spiketrains.conn.inh[1], p)
  0.853824 seconds (1.35 k allocations: 22.599 MiB)
(predicted_type = :inh, pval = 0.01, area = -3.55)

Sidenote, shuffle connection test takes a while. Profiling shows that almost all time is spent in the calc_STA addition loop.

STA .+= @view VI_sig[a:b]

Doesn’t seem much more optimizable.

Wait, that broadcasting . is not necessary. Let’s see what perf is without.

@time test_connection_and_type(ii.v, ii.spiketrains.conn.inh[1], p)
  5.284591 seconds (720.96 k allocations: 5.469 GiB, 20.00% gc time)
(predicted_type = :inh, pval = 0.01, area = -3.55)

Huh, perf is way worse without the broadcast, with far more allocations: without the dot, `STA += @view VI_sig[a:b]` builds a fresh array on every iteration instead of updating `STA` in place. Revert.

Trying with manual for loop (and @inbounds):

[..]

This made a type instability problem apparent: the manual loop was super slow. `@code_warntype` showed much red; the problem was that Δt's type was not inferred (because the simtype field is abstractly typed). Manually annotating the type made the manual loop fast, and also made the previous, non-manual loop faster :)

Yes good, the new profile now also shows that the majority of time is spent in float:+, as it should be.
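The pattern, in a minimal standalone illustration (hypothetical struct names, not the actual VoltoMapSim types):

```julia
abstract type SimType end

struct Euler <: SimType
    Δt::Float64
end

struct ParamsAbstract
    simtype::SimType    # abstract field type: Δt's type can't be inferred
end

struct ParamsConcrete{T<:SimType}
    simtype::T          # concrete parametric field type: fully inferred
end

get_dt(p) = p.simtype.Δt

pa = ParamsAbstract(Euler(1e-4))
pc = ParamsConcrete(Euler(1e-4))

# @code_warntype get_dt(pa)  # shows `Any` in red (type-unstable)
# @code_warntype get_dt(pc)  # inferred as Float64 (stable)
# An explicit annotation at the use site, `Δt::Float64 = p.simtype.Δt`,
# also restores stability inside a hot loop.
```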

Now testing whether sim itself suffers from this same type instability problem..

# pshort = (@set p.sim.general.duration = 1*seconds);
# @code_warntype VoltoMapSim.init_sim(pshort.sim);
# state = VoltoMapSim.init_sim(pshort.sim);
#@code_warntype VoltoMapSim.step_sim!(state, pshort.sim, 1);

It doesn’t. Type known-ness (“stability”) seems good enough.


Back on topic

Now use the new test_connection_and_type in a reworked eval_conntest_perf function:

using DataFrames
# @cached key=[:m,:p] (
function evaluate_conntest_perf_v2(s, m, p)
    # s = augmented simdata
    # m = postsynaptic neuron ID
    @unpack N_tested_presyn, rngseed = p.evaluation;
    resetrng!(rngseed)
    function get_IDs_labels(IDs, label)
        N = min(length(IDs), N_tested_presyn)
        IDs_sample = sample(IDs, N, replace = false, ordered = true)
        return zip(IDs_sample, fill(label, N))
    end
    ii = s.input_info[m]
    IDs_labels = chain(
        get_IDs_labels(ii.exc_inputs, :exc),
        get_IDs_labels(ii.inh_inputs, :inh),
        get_IDs_labels(ii.unconnected_neurons, :unconn),
    )
    tested_neurons = DataFrame(
        input_neuron_ID = Int[],     # global ID
        real_type       = Symbol[],  # :unconn, :exc, :inh
        predicted_type  = Symbol[],  # idem
        pval            = Float64[],
        area_over_start = Float64[],
    )
    @showprogress (every = 400ms) "Testing connections: " (
    for (n, label) in collect(IDs_labels)  # `collect` necessary somehow
        test_result = test_connection_and_type(ii.v, s.spike_times[n], p)
        row = (input_neuron_ID = n, real_type = label, test_result...)
        push!(tested_neurons, Dict(pairs(row)))
    end)
    tn = tested_neurons
    det_rate(t) = count((tn.real_type .== t) .& (tn.predicted_type .== t)) / count(tn.real_type .== t)    
    detection_rates = (
        TPR_exc = det_rate(:exc),
        TPR_inh = det_rate(:inh),
        FPR = 1 - det_rate(:unconn),
    )
    return (; tested_neurons, detection_rates)
end
# end)

cached_eval(s, m, p) = cached(evaluate_conntest_perf_v2, [s, m, p], key = [m, p]);

Aside: @cached macro wish.
It should also generate a $(funcname)_uncached. So here: evaluate_conntest_perf_v2_uncached.

# @profview evaluate_conntest_perf_v2(s, 1, p);
Testing connections: 100%|██████████████████████████████| Time: 0:00:32

^ Most time spent in the + and the setindex of the calc_STA loop. (A bit also in shuffle_ISIs). (Could speed up a bit maybe by having STA on the stack – prob using StaticArrays.jl – to save that setindex time. Though wouldn’t be major, as most time is still spent in float +). Another thing to try is @fastmath IEEE rules relaxing, to speed up the +.
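A sketch of what a `@fastmath`/`@inbounds` variant of that hot loop could look like (names hypothetical; `VI_sig` stands for the full voltage signal, and `a` for the start of one spike-aligned window):

```julia
function add_window!(STA, VI_sig, a)
    # Accumulate one spike-aligned window of the signal into the running STA sum.
    # @inbounds skips bounds checks; @fastmath relaxes IEEE rules for the +.
    @inbounds @fastmath for i in eachindex(STA)
        STA[i] += VI_sig[a + i - 1]
    end
    return STA
end

VI_sig = rand(1000)
STA = zeros(100)
add_window!(STA, VI_sig, 1)    # first window
add_window!(STA, VI_sig, 101)  # accumulate the next window
```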

testeval = cached_eval(s, 1, p);
Testing connections: 100%|██████████████████████████████| Time: 0:00:25
Saving output at `C:\Users\tfiers\.phdcache\datamodel v2 (net)\evaluate_conntest_perf_v2\b8c53748899be8e2.jld2` … done (6.9 s)
ENV["LINES"] = 100
100
testeval.detection_rates
(TPR_exc = 0, TPR_inh = 1, FPR = 0.125)

Comparison with before, when we didn’t predict the type (just connected or not) (from 2022-07-05__Network-conntest):

(TPR_exc = 0.154, TPR_inh = 1, FPR = 0.15)

The exc performance drops to zero. This is because all previously detected exc inputs are still detected, but now classified as inhibitory, as seen in the table below.

The FPR is different because we now take a random sample of unconnected neurons to test (before, we took the first 40).

testeval.tested_neurons

76 rows × 5 columns

Columns (after the row index): input_neuron_ID (Int64), real_type (Symbol), predicted_type (Symbol), pval (Float64), area_over_start (Float64)
1312excunconn0.5111.3
2565excunconn0.617.25
3681excunconn0.556.58
4597excinh0.02-37.9
5447excunconn0.681.42
6766excunconn0.46-7.55
7132excunconn0.54-20.9
8139excunconn0.91-22.8
9446excunconn0.34-27.5
10194excunconn0.37-31
11710excunconn0.63-38.1
12352excunconn0.11-19.9
1311excinh0.01-12.2
14418excinh0.03-21
15101excunconn0.88-1.2
16629excunconn0.369.04
17145excunconn0.47-27.5
18303excunconn0.05-20
19169excunconn0.681.38
2066excunconn0.72-21.2
2170excunconn0.94-0.564
22800excunconn0.72-39.6
23516excunconn0.33-28
24136excinh0.02-55.5
2533excunconn0.86-35.5
26337excunconn0.56-88.9
27988inhinh0.01-3.55
28894inhinh0.01-56.7
29902inhinh0.01-59
30928inhinh0.01-70.6
31914inhinh0.01-52.1
32831inhinh0.01-7.06
33897inhinh0.01-72
34829inhinh0.01-29.4
35908inhinh0.01-63.8
36922inhinh0.01-33.7
3723unconnunconn0.94-4.49
3825unconnunconn0.7514.4
3986unconnunconn0.11-35.3
40113unconninh0.03-33.1
41197unconnunconn0.52-11.8
42227unconnunconn0.84-9.01
43230unconnunconn0.36-1.34
44262unconnunconn0.45-5.2
45269unconnunconn0.23-32.9
46323unconnunconn0.08-10.5
47332unconnunconn0.14-15.5
48367unconnunconn0.96-18.6
49394unconnunconn0.3-10.7
50410unconnunconn0.371.49
51424unconnunconn0.12-31.5
52439unconnunconn0.78-39.4
53460unconnunconn0.17-2.35
54487unconnunconn0.881.33
55499unconnunconn0.43-13.4
56521unconnunconn0.851.62
57537unconnunconn0.14-20.3
58547unconnunconn0.23-35.3
59599unconninh0.02-48.4
60612unconnunconn0.22-19.7
61646unconnunconn0.58-0.319
62669unconnunconn0.73-4.47
63702unconnunconn0.3-26.1
64722unconnunconn0.11-1.35
65768unconnunconn0.08-10.7
66790unconnunconn0.18-24.8
67813unconnunconn0.141.73
68821unconnunconn0.44-0.367
69842unconnexc0.032.01
70843unconnunconn0.23-10.6
71875unconninh0.01-7.61
72882unconnunconn0.18-15.8
73896unconnunconn0.58-19.3
74921unconnexc0.015.07
75956unconnunconn0.06-18.9
76977unconnunconn0.49-10.1

Inhibitory postsynaptic neuron

Now test input connections to the recorded inhibitory neuron:

te_inh = cached_eval(s, 801, p);
Testing connections: 100%|██████████████████████████████| Time: 0:00:21
Saving output at `C:\Users\tfiers\.phdcache\datamodel v2 (net)\evaluate_conntest_perf_v2\e828a750d5830742.jld2` … done
te_inh.detection_rates
(TPR_exc = 0.619, TPR_inh = 0.8, FPR = 0.075)

Comparison with before, as above:

(TPR_exc = 0.714, TPR_inh = 0.9, FPR = 0.05)

Results per neuron below. The area_over_start seems to work: exc is mostly positive (and thus predicted as exc), and vice versa for inhibitory/negative.

te_inh.tested_neurons

71 rows × 5 columns

Columns (after the row index): input_neuron_ID (Int64), real_type (Symbol), predicted_type (Symbol), pval (Float64), area_over_start (Float64)
1544excexc0.01156
2345excexc0.0141.6
3468excexc0.0174.1
4206excexc0.01147
5689excunconn0.4716.3
6251excexc0.0187.9
767excexc0.0137.1
8540excexc0.010.983
9331excexc0.0170.3
10603excexc0.0116.6
11696excunconn0.0731.2
12693excinh0.04-10.8
1347excunconn0.3754.6
14533excexc0.01190
15513excexc0.0189.5
16639excexc0.0168.5
17615excexc0.0137.6
1855excinh0.01-26.8
19215excunconn0.0665.5
20411excunconn0.8245.2
21662excunconn0.12-62.9
22826inhunconn0.63-19.9
23846inhexc0.012.51
24815inhinh0.01-31.6
25937inhinh0.01-60.7
26972inhinh0.02-10.4
27857inhinh0.01-112
28874inhinh0.01-43.1
29981inhinh0.01-49.9
30971inhinh0.01-26.3
31896inhinh0.01-65.8
3214unconnunconn0.07-21.8
3365unconnunconn0.2270.9
3476unconnunconn0.9219.3
35172unconnunconn0.0518.3
36179unconnunconn0.56-52.1
37199unconnunconn0.62-39.1
38201unconnunconn0.5-2.87
39258unconnunconn0.87-47.2
40283unconnunconn0.1129.7
41357unconninh0.01-7.26
42385unconninh0.01-38.2
43387unconnunconn0.94-105
44418unconnunconn0.17-42
45424unconnunconn0.96-0.523
46473unconnunconn0.3337.7
47482unconnunconn0.5781.1
48514unconnunconn0.7621.5
49541unconnunconn0.8-37.1
50556unconnunconn0.219.93
51568unconnunconn0.81-37.8
52582unconnunconn0.32-19.5
53600unconnunconn0.16-53.8
54627unconnunconn0.725.6
55638unconnexc0.0129
56659unconnunconn0.66-26.1
57675unconnunconn0.14-83.3
58684unconnunconn0.65-1.37
59734unconnunconn0.4-88.9
60746unconnunconn0.895.16
61777unconnunconn0.55105
62798unconnunconn0.1414.4
63830unconnunconn0.24-9.23
64848unconnunconn0.38-1.16
65889unconnunconn0.79-6.18
66909unconnunconn0.25-18.2
67927unconnunconn0.08-20.9
68934unconnunconn0.43-30.4
69952unconnunconn0.45-20.4
70953unconnunconn0.74-25.5
71964unconnunconn0.25-10.9

We misclassify an inhibitory input as excitatory because its "area over start" is positive. What does the STA look like?

plotSTA(s.signals[801].v, s.spike_times[846], p);
../_images/2022-08-21__Area-under-STA_83_0.png

Looks nicely downwards..

area(calc_STA(s.signals[801].v, s.spike_times[846], p))
0.0251

But the area-over-start is indeed positive.

Could be remedied with a shorter STA window length.
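A hedged sketch of that remedy: compute the area over only an initial fraction of the STA window, where the synaptic dip sits, ignoring the late rebound. `area_windowed` and `win_frac` are hypothetical names, not from the codebase:

```julia
area(STA) = sum(STA .- STA[1])   # full-window version, as in misc.jl

function area_windowed(STA; win_frac = 0.5)
    # Only sum the first `win_frac` of the window.
    n = max(1, round(Int, win_frac * length(STA)))
    return sum(STA[1:n] .- STA[1])
end

sta = [0.0, -0.3, -0.5, -0.2, 0.4, 0.8]  # toy STA: early dip, late rise

area(sta)           # > 0: full window would misread this as exc
area_windowed(sta)  # < 0: truncated window correctly says inh
```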

For comparison, the STA of a correctly identified inh input:

plotSTA(s.signals[801].v, s.spike_times[815], p);
../_images/2022-08-21__Area-under-STA_88_0.png

Looks very similar (the relative height of the positive and negative bumps).

The problem (the difference) is at t = 0.


Unrelated question (on previous results): why are exc-input STAs onto inhibitory neurons good, but onto excitatory neurons so bad?