ijazz.RegionalFitter

Classes

RegionalFitter

Fit scale and smearing parameters in category regions.

Module Contents

class ijazz.RegionalFitter.RegionalFitter(dt: pandas.DataFrame, mc: pandas.DataFrame, n_par: int = -1, name_cat='cat', name_mll='mee', name_weights: str = None, min_nevt_region_dt=10, min_nevt_region_mc=100, win_z_dt=(70, 110), win_z_mc=(50, 130), bin_width_dt='Q', bin_width_mc=0.1, double_gaussian=False, single_parameters=[], error_parameters=['resp', 'reso'], fast_hist=False, is_mmg=False, verbose=True)

Fit scale and smearing parameters in category regions.

bin_width_dt = 'Q'

name_cat = 'cat'

name_mll = 'mee'

win_z_dt = (70, 110)

idx_pars = None

eresp = None

ereso = None

corr = None

err = None

cov = None

hess = None

is_mmg = False

verbose = True

double_gaussian = False

single_parameters = []

error_parameters = ['resp', 'reso']

n_par = -1

resp

reso

error_variables

eresp_mc

ereso_mc

regions

idt = 0

imc = 1

regional_electron_indices

n_reg

bins_mc

bins_mean_mc = []

bins_mid_mc

bins_dt = []

bins_width = []

pi_mask

n_ic

m_jc

m_ic

b_ic

b_jc

s2_m_jc

rll_sll(cats=slice(None))

Compute the dilepton rll and sll for each Gaussian. In the double-Gaussian case, this yields 4 Gaussians.

Parameters:

cats (slice, optional) – categories where to compute the rll and sll. Defaults to slice(None).

Returns:

tuple of rll and sll for each Gaussian

Return type:

tuple
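
The count of 4 Gaussians follows from pairing each of the two components of one lepton with each of the two components of the other. The sketch below illustrates that pairing only. The combination formulas (rll as the geometric mean of the two responses, sll as half the quadrature sum of the two relative resolutions), the per-component fractions, and the name dilepton_gaussians are assumptions made for illustration and are not taken from the ijazz source.

```python
import math
from itertools import product


def dilepton_gaussians(lep1, lep2):
    """Pair per-lepton Gaussian components into dilepton components.

    Each lepton is a list of (fraction, resp, reso) tuples.
    Assumed combination rules (hypothetical, for illustration):
      rll = sqrt(resp1 * resp2)
      sll = 0.5 * sqrt(reso1**2 + reso2**2)
    """
    out = []
    for (f1, r1, s1), (f2, r2, s2) in product(lep1, lep2):
        rll = math.sqrt(r1 * r2)
        sll = 0.5 * math.hypot(s1, s2)
        out.append((f1 * f2, rll, sll))
    return out


# Hypothetical numbers: a narrow core and a wider tail for each lepton.
lep1 = [(0.8, 1.00, 0.010), (0.2, 0.99, 0.030)]
lep2 = [(0.7, 1.01, 0.015), (0.3, 1.00, 0.040)]
components = dilepton_gaussians(lep1, lep2)  # 2 x 2 = 4 components
```

With a single Gaussian per lepton the same function returns one component, which matches the single-Gaussian case.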

pi(cats: slice = slice(None), rll_sll: tuple = None)

Compute the probability of being in bin i (after smearing) in each category.

Parameters:

cats (slice, optional) – categories where to compute the pis. Defaults to slice(None).
rll_sll (tuple, optional) – tuple with the rll and sll for each gaussian. Defaults to None.

Returns:

2D array of probabilities per category

Return type:

np.array
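
One way to picture a per-bin probability after smearing is to spread every MC event over the data bins with a Gaussian and integrate it between the bin edges. The sketch below does this for a single category with one Gaussian. The smearing model (mean rll * m, width sll * rll * m) and the names norm_cdf and bin_probabilities are assumptions for illustration; the fitter itself works on binned MC and may differ in detail.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bin_probabilities(mc_masses, mc_weights, edges, rll, sll):
    """Probability of each data bin after scaling and smearing the MC.

    Assumption: an MC event of mass m becomes a Gaussian of mean
    rll * m and width sll * rll * m.
    """
    n_bins = len(edges) - 1
    probs = [0.0] * n_bins
    total = sum(mc_weights)
    for m, w in zip(mc_masses, mc_weights):
        mu = rll * m
        sigma = sll * mu
        cdf = [norm_cdf((e - mu) / sigma) for e in edges]
        for i in range(n_bins):
            # Gaussian integral between the two edges of bin i
            probs[i] += w / total * (cdf[i + 1] - cdf[i])
    return probs


edges = [70.0 + 2.0 * i for i in range(21)]  # 70 to 110 in steps of 2
probs = bin_probabilities([88.0, 91.0, 94.0], [1.0, 2.0, 1.0],
                          edges, rll=1.0, sll=0.02)
```

The probabilities sum to slightly less than 1 because a small part of each Gaussian falls outside the window.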

nll_cat(cats: slice = slice(None))

Compute the negative log-likelihood (multinomial law).

Parameters:

cats (slice, optional) – categories where to compute the nll. Defaults to slice(None).

Returns:

nll

Return type:

tf.float64
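
For reference, a multinomial negative log-likelihood for observed counts n_i and predicted bin probabilities p_i is, up to a constant that does not depend on the parameters, the sum of -n_i log(p_i). The sketch below is a plain-Python illustration of that formula for one category; the function name multinomial_nll is hypothetical, and the fitter's own implementation uses TensorFlow tensors.

```python
import math


def multinomial_nll(counts, probs):
    """-sum_i n_i * log(p_i), dropping the parameter-independent constant."""
    return -sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)


counts = [10, 30, 60]
# The NLL is smallest when the probabilities equal the observed fractions.
best = [n / sum(counts) for n in counts]
nll_best = multinomial_nll(counts, best)
nll_other = multinomial_nll(counts, [0.2, 0.3, 0.5])
```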

dnll_dmjc(cats)

Compute the gradient of the negative log-likelihood with respect to the MC histogram.

Parameters:

cats (slice) – categories where to compute the gradient.

Returns:

gradient of the nll with respect to the MC histogram.

Return type:

tf.float64

nll_batch(**kwargs)

batcher(afunc, batch_size=-1, shuffle=False, cats=slice(None))

Compute the likelihood in batches and then sum up the results.

Parameters:

batch_size (int) – size of the batch.
shuffle (bool) – shuffle the events before batching (this has no effect on the result here and is kept for timing tests).
cats (slice, optional) – categories where to compute the likelihood. Defaults to slice(None).

Returns:

the summed likelihood.

Return type:

tf.float64
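
The batch-and-sum pattern can be sketched in a few lines. This is not the method's implementation: the helper batched_sum, its n_items argument, and the convention that a non-positive batch_size means a single call are assumptions chosen to mirror the default of -1.

```python
def batched_sum(afunc, n_items, batch_size=-1):
    """Evaluate afunc on consecutive slices and add up the results.

    afunc takes a slice and returns a number. A batch_size <= 0 is
    assumed to mean one call over all items.
    """
    if batch_size <= 0:
        return afunc(slice(0, n_items))
    total = 0.0
    for start in range(0, n_items, batch_size):
        stop = min(start + batch_size, n_items)
        total += afunc(slice(start, stop))
    return total


# Hypothetical per-category NLL values.
per_category_nll = [1.5, 0.25, 3.0, 2.25, 0.5]


def nll_of(cats):
    """Sum the NLL over the selected categories."""
    return sum(per_category_nll[cats])


full = batched_sum(nll_of, len(per_category_nll))
in_batches = batched_sum(nll_of, len(per_category_nll), batch_size=2)
```

Because the likelihood is a sum over independent terms, the batched and unbatched results agree; batching only bounds the memory used per call.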

train_epoch(optimizer, batch_size=-1, batch_training=True, cats=slice(None))

Train a single epoch for the model.

Parameters:

optimizer (tf.keras.optimizers.Optimizer) – The Keras optimizer used to apply gradients.
batch_size (int, optional) – The size of the batch for negative log-likelihood (NLL) computation. Defaults to -1, which means no batching.
batch_training (bool, optional) – If True, updates parameters at each batch for faster training. Defaults to True.
cats (slice, optional) – A slice object to select specific categories of data. Defaults to slice(None).

Returns:

The total negative log-likelihood (NLL) for the epoch.

Return type:

tf.Tensor

minimize(optimizer, dnll_tol=0.1, max_epochs=1000, minimizer='Adam', init_rand=False, nepoch_print=10, init_resp=None, init_reso=None, device='CPU', batch_size=-1, batch_training=True, cats=slice(None))

Minimizes the negative log-likelihood using either a TensorFlow optimizer or a SciPy minimizer, as selected by minimizer.

Parameters:

optimizer (tf.keras.optimizers.Optimizer) – TensorFlow optimizer to apply gradients.
dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.
max_epochs (int) – Maximum number of epochs for optimization.
minimizer (str) – Optimization method, either ‘Adam’ or a SciPy minimizer (e.g., ‘TNC’).
init_rand (bool) – If True, initializes variables randomly.
nepoch_print (int) – Frequency of printing progress (every nepoch_print epochs).
init_resp (np.ndarray, optional) – Initial values for the response parameters. Defaults to None.
init_reso (np.ndarray, optional) – Initial values for the resolution parameters. Defaults to None.
device (str) – Device to use for computation (‘CPU’ or ‘GPU’).
batch_size (int) – Size of the batch for likelihood computation.
batch_training (bool) – If True, updates parameters at each batch for faster training.
cats (slice) – A slice object to select specific categories of data.

Returns:

List of negative log-likelihood values for each epoch.

Return type:

List[float]
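
The roles of dnll_tol and max_epochs can be illustrated with a toy loop that stops once the change in -2logL between two consecutive epochs drops below the tolerance. The exact comparison used by the fitter is not shown in this page, so the stopping rule below, the helper minimize_sketch, and the toy step function are assumptions.

```python
def minimize_sketch(step, dnll_tol=0.1, max_epochs=1000):
    """Call step() once per epoch until -2logL changes by less than dnll_tol.

    step() performs one epoch of parameter updates and returns the NLL.
    Returns the list of NLL values, one per epoch.
    """
    history = []
    for epoch in range(max_epochs):
        nll = step()
        history.append(nll)
        # Assumed criterion: |2 * (previous NLL - current NLL)| < dnll_tol
        if epoch > 0 and abs(2.0 * (history[-2] - nll)) < dnll_tol:
            break
    return history


# Toy problem: nll(x) = 50 * (x - 1)**2, minimum at x = 1.
state = {"x": 0.0}


def toy_step():
    """Move halfway to the minimum and return the new NLL."""
    state["x"] += 0.5 * (1.0 - state["x"])
    return 50.0 * (state["x"] - 1.0) ** 2


history = minimize_sketch(toy_step)
```

The returned list plays the role of the per-epoch NLL values described above; its length shows how many epochs were needed.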

minimize_tf(optimizer, dnll_tol=0.1, max_epochs=1000, init_rand=False, nepoch_print=10, init_resp=None, init_reso=None, device='CPU', batch_size=-1, batch_training=True, cats=slice(None))

Minimizes the negative log-likelihood using a TensorFlow optimizer.

Parameters:

optimizer (tf.keras.optimizers.Optimizer) – TensorFlow optimizer to apply gradients.
dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.
max_epochs (int) – Maximum number of epochs for optimization.
init_rand (bool) – If True, initializes variables randomly.
nepoch_print (int) – Frequency of printing progress (every nepoch_print epochs).
init_resp (np.ndarray, optional) – Initial values for the response parameters. Defaults to None.
init_reso (np.ndarray, optional) – Initial values for the resolution parameters. Defaults to None.
device (str) – Device to use for computation (‘CPU’ or ‘GPU’).
batch_size (int) – Size of the batch for likelihood computation.
batch_training (bool) – If True, updates parameters at each batch for faster training.
cats (slice) – A slice object to select specific categories of data.

Returns:

Array of negative log-likelihood values for each epoch.

Return type:

np.ndarray

minimize_sp(dnll_tol=0.1, minimizer='TNC', init_rand=False, init_resp=None, init_reso=None, device='CPU', batch_size=-1, cats=slice(None))

Minimizes the negative log-likelihood using a SciPy optimizer.

Parameters:

dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.
minimizer (str) – SciPy minimizer method (e.g., ‘TNC’, ‘L-BFGS-B’).
init_rand (bool) – If True, initializes variables randomly.
init_resp (np.ndarray, optional) – Initial values for the response parameters. Defaults to None.
init_reso (np.ndarray, optional) – Initial values for the resolution parameters. Defaults to None.
device (str) – Device to use for computation (‘CPU’ or ‘GPU’).
batch_size (int) – Size of the batch for likelihood computation.
cats (slice) – A slice object to select specific categories of data.

Returns:

Array containing the final negative log-likelihood value.

Return type:

np.ndarray

get_index(e1=None, e2=None, c=None)

Get indices based on electron or pair categories.

Parameters:

e1 (list, optional) – List of electron indices to match.
e2 (list, optional) – List of electron indices to match. If both e1 and e2 are provided, pairs of indices will be matched.
c (list[tuple], optional) – List of electron index pairs to match.

Returns:

Unique indices matching the categories. Returns all indices if no categories are given.

Return type:

pd.Index

Notes

If both e1 and e2 are provided, all combinations of pairs, including reversed, are considered.
Assumes self.regional_electron_indices is a pandas DataFrame.
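
The pair matching described in the notes can be sketched with plain tuples. The list regions below stands in for the pandas DataFrame of regional electron indices, and the function matching_regions is hypothetical. Only the two cases stated above are covered: both e1 and e2 provided, or neither.

```python
from itertools import product


def matching_regions(regions, e1=None, e2=None):
    """Return positions of regions whose electron pair matches e1 x e2.

    regions is a list of (i1, i2) electron-index pairs. All combinations
    of e1 and e2 are accepted in either order.
    """
    if e1 is None and e2 is None:
        # No categories requested: every region matches.
        return list(range(len(regions)))
    if e1 is None or e2 is None:
        raise ValueError("this sketch only covers e1 and e2 given together")
    wanted = set(product(e1, e2)) | set(product(e2, e1))
    return [i for i, pair in enumerate(regions) if pair in wanted]


regions = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 1)]
selected = matching_regions(regions, e1=[0], e2=[1])
```

Here selected contains the positions of (0, 1) and of its reversed pair (1, 0).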

plot_region_fits(cols=5, show_mc=False, n_plots=None, e1=None, e2=None, cats=None, rll_sll=None)

Plot the data and fit for each region.

Parameters:

cols (int) – Number of columns in the plot.
show_mc (bool) – If True, also show the uncorrected Monte Carlo.
n_plots (int) – Number of plots to show.
e1 (list) – List of electron indices to match.
e2 (list) – List of electron indices to match. If both e1 and e2 are provided, pairs of indices will be matched.
cats (list[tuple]) – List of electron index pairs to match.
rll_sll (tuple) – Tuple with the rll and sll for each Gaussian.

Returns:

Tuple containing the figure and axes objects.

Return type:

tuple

get_hessian(afunc, numerical=True, **kwargs)

Compute the Hessian of a function numerically or analytically.

Parameters:

afunc (function) – function whose Hessian is computed.
numerical (bool, optional) – if True, compute the Hessian numerically; otherwise analytically. Defaults to True.

Returns:

hessian matrix

Return type:

np.array
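
As an illustration of what a numerical Hessian involves, the sketch below uses central finite differences on a plain Python function. The scheme, the step size eps, and the name numerical_hessian are assumptions; the page does not state which numerical method get_hessian uses.

```python
def numerical_hessian(afunc, x, eps=1e-4):
    """Central finite-difference Hessian of afunc at point x (a list)."""
    n = len(x)
    hess = [[0.0] * n for _ in range(n)]

    def shifted(i, di, j, dj):
        # Evaluate afunc with coordinates i and j displaced by di and dj.
        y = list(x)
        y[i] += di
        y[j] += dj
        return afunc(y)

    for i in range(n):
        for j in range(i, n):
            h = (shifted(i, eps, j, eps) - shifted(i, eps, j, -eps)
                 - shifted(i, -eps, j, eps) + shifted(i, -eps, j, -eps))
            hess[i][j] = hess[j][i] = h / (4.0 * eps ** 2)
    return hess


def toy_nll(p):
    """Quadratic toy function with exact Hessian [[4, 1], [1, 6]]."""
    return 2.0 * p[0] ** 2 + p[0] * p[1] + 3.0 * p[1] ** 2


hess = numerical_hessian(toy_nll, [0.3, -0.2])
```

The result is symmetric by construction, since each off-diagonal entry is computed once and mirrored.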

covariance(numerical=True, force=False, **kwargs)

Computes and stores the Hessian, covariance, and correlation matrices.

Parameters:

numerical (bool, optional) – Whether to compute the Hessian numerically. Defaults to True.
force (bool, optional) – If True, recompute all matrices even if they already exist. Defaults to False.
**kwargs – Additional parameters for the nll_batch function (e.g., batch_size).

Returns:

None
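
The three matrices are related in the usual way for a likelihood fit: the covariance is commonly estimated as the inverse of the Hessian of the NLL at the minimum, the errors are the square roots of its diagonal, and the correlation rescales the covariance by those errors. The 2x2 sketch below illustrates these relations under that assumption; the helper covariance_2x2 is hypothetical and does not reproduce how the fitter stores hess, cov, err, and corr.

```python
import math


def covariance_2x2(hess):
    """Invert a 2x2 Hessian and derive errors and correlations."""
    (a, b), (c, d) = hess
    det = a * d - b * c
    cov = [[d / det, -b / det], [-c / det, a / det]]
    err = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
    corr = [[cov[i][j] / (err[i] * err[j]) for j in range(2)]
            for i in range(2)]
    return cov, err, corr


cov, err, corr = covariance_2x2([[4.0, 1.0], [1.0, 6.0]])
```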

calc_err_mc() → None

Compute the uncertainty due to limited MC statistics.

Returns:

None. The errors are stored in dedicated variables in the fitter.

Return type:

None