ijazz.RegionalFitter

Classes

RegionalFitter

Fit scale and smearing parameters in category regions.

Module Contents

class ijazz.RegionalFitter.RegionalFitter(dt: pandas.DataFrame, mc: pandas.DataFrame, n_par: int = -1, name_cat='cat', name_mll='mee', name_weights: str = None, min_nevt_region_dt=10, min_nevt_region_mc=100, win_z_dt=(70, 110), win_z_mc=(50, 130), bin_width_dt='Q', bin_width_mc=0.1, double_gaussian=False, single_parameters=[], error_parameters=['resp', 'reso'], fast_hist=False, is_mmg=False, verbose=True)

Fit scale and smearing parameters in category regions.

bin_width_dt = 'Q'
name_cat = 'cat'
name_mll = 'mee'
win_z_dt = (70, 110)
idx_pars = None
eresp = None
ereso = None
corr = None
err = None
cov = None
hess = None
is_mmg = False
verbose = True
double_gaussian = False
single_parameters = []
error_parameters = ['resp', 'reso']
n_par = -1
resp
reso
error_variables
eresp_mc
ereso_mc
regions
idt = 0
imc = 1
regional_electron_indices
n_reg
bins_mc
bins_mean_mc = []
bins_mid_mc
bins_dt = []
bins_width = []
pi_mask
n_ic
m_jc
m_ic
b_ic
b_jc
s2_m_jc
rll_sll(cats=slice(None))

Compute the dilepton rll and sll for each Gaussian. In the double-Gaussian case, this gives 4 Gaussians.

Parameters:

cats (slice, optional) – categories where to compute the rll and sll. Defaults to slice(None).

Returns:

tuple of rll and sll for each gaussian

Return type:

tuple
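
The four-Gaussian combinatorics can be illustrated with a minimal sketch. It assumes (the documentation above does not confirm this) that the dilepton response is the geometric mean of the two lepton responses and that the relative resolutions add in quadrature with a factor 1/2, as follows from a mass proportional to sqrt(E1 * E2). The function name `combine_rll_sll` and the `(resp, reso, frac)` input layout are hypothetical.

```python
import itertools
import math


def combine_rll_sll(lep1, lep2):
    """Combine per-lepton Gaussian components into dilepton components.

    lep1, lep2: lists of (resp, reso, frac) tuples, one per Gaussian.
    Returns a list of (rll, sll, weight) tuples, one per pairing.
    Assumption: m ~ sqrt(E1 * E2), so rll = sqrt(r1 * r2) and
    sll = 0.5 * sqrt(s1**2 + s2**2) for relative resolutions.
    """
    out = []
    for (r1, s1, f1), (r2, s2, f2) in itertools.product(lep1, lep2):
        rll = math.sqrt(r1 * r2)
        sll = 0.5 * math.hypot(s1, s2)
        out.append((rll, sll, f1 * f2))
    return out


# Two Gaussians per lepton (core + tail) give 2 x 2 = 4 dilepton Gaussians.
lep = [(1.00, 0.02, 0.8), (0.98, 0.05, 0.2)]
components = combine_rll_sll(lep, lep)
```

The weights of the four components are the products of the per-lepton fractions, so they still sum to one.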

pi(cats: slice = slice(None), rll_sll: tuple = None)

Compute the probability of falling in bin i (after smearing) in each category.

Parameters:
  • cats (slice, optional) – categories where to compute the pis. Defaults to slice(None).

  • rll_sll (tuple, optional) – tuple with the rll and sll for each gaussian. Defaults to None.

Returns:

2D array of probabilities per category

Return type:

np.array
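
To make the idea of a per-bin probability after smearing concrete, here is a small sketch for a single event and a single Gaussian. It assumes the smeared mass is Gaussian with mean `rll * mass` and relative width `sll`; the helper names are hypothetical and the real method works on whole MC histograms per category.

```python
import math


def gauss_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal law."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bin_probabilities(edges, mass, rll, sll):
    """Probability that an event of true mass `mass` lands in each bin
    after scaling by rll and Gaussian smearing of relative width sll."""
    mu = rll * mass
    sigma = sll * mu
    cdf = [gauss_cdf(e, mu, sigma) for e in edges]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


# Unit-width bins over the default data window (70, 110).
edges = [70.0 + i for i in range(41)]
probs = bin_probabilities(edges, mass=91.2, rll=1.0, sll=0.02)
```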

nll_cat(cats: slice = slice(None))

Compute the negative log-likelihood (multinomial law).

Parameters:

cats (slice, optional) – categories where to compute the nll. Defaults to slice(None).

Returns:

nll

Return type:

tf.float64
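
For reference, a multinomial negative log-likelihood can be sketched as below. This is only an illustration with plain Python lists (the fitter itself returns a TensorFlow scalar), and the constant multinomial coefficient is dropped since it does not depend on the fitted parameters.

```python
import math


def multinomial_nll(counts, probs):
    """-sum_i n_i * log(p_i), dropping the constant multinomial coefficient."""
    return -sum(n * math.log(p) for n, p in zip(counts, probs) if n > 0)


counts = [10, 30, 60]
total = sum(counts)

# The NLL is smallest when the probabilities equal the observed fractions.
best = [n / total for n in counts]
nll_best = multinomial_nll(counts, best)
nll_flat = multinomial_nll(counts, [1 / 3, 1 / 3, 1 / 3])
```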

dnll_dmjc(cats)

Compute the gradient of the negative log-likelihood with respect to the MC histogram.

Parameters:

cats (slice) – categories where to compute the gradient.

Returns:

gradient of the nll with respect to the MC histogram.

Return type:

tf.float64
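
The following sketch shows what such a gradient looks like in a deliberately simplified setting. It assumes the bin probabilities are just the normalized MC histogram, p_i = m_i / sum(m), with no smearing; the actual fitter is more involved, so this is a toy cross-check of the analytic form against a finite difference rather than the library's formula.

```python
import math


def nll(counts, mc):
    """Multinomial NLL with probabilities taken from the MC histogram."""
    total_mc = sum(mc)
    return -sum(n * math.log(m / total_mc) for n, m in zip(counts, mc))


def dnll_dm(counts, mc):
    """Analytic gradient: d(nll)/d(m_j) = -n_j / m_j + N / M."""
    n_tot, m_tot = sum(counts), sum(mc)
    return [-n / m + n_tot / m_tot for n, m in zip(counts, mc)]


counts = [12.0, 35.0, 53.0]
mc = [150.0, 300.0, 550.0]
grad = dnll_dm(counts, mc)

# Cross-check against a central finite difference in the first bin.
eps = 1e-4
up = [mc[0] + eps] + mc[1:]
down = [mc[0] - eps] + mc[1:]
grad0_numeric = (nll(counts, up) - nll(counts, down)) / (2 * eps)
```

In this toy model the NLL is invariant under a global rescaling of the MC histogram, so the gradient is orthogonal to `mc`.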

nll_batch(**kwargs)

batcher(afunc, batch_size=-1, shuffle=False, cats=slice(None))

Compute the likelihood in batches, then sum the results.

Parameters:
  • batch_size (int) – size of the batch.

  • shuffle (bool) – shuffle the events first (of no use in this case, but kept for timing tests).

Returns:

the summed likelihood.

Return type:

tf.float64
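
The batching pattern can be sketched generically: evaluate an additive function on consecutive slices and add up the pieces. The names `batched_sum` and `toy_nll` are hypothetical, and a plain list stands in for whatever the fitter actually slices.

```python
def batched_sum(func, items, batch_size=-1):
    """Evaluate func on consecutive slices of items and add the results.

    batch_size <= 0 means a single batch containing everything, in line
    with the -1 default of the methods documented here.
    """
    if batch_size <= 0:
        return func(items)
    total = 0.0
    for start in range(0, len(items), batch_size):
        total += func(items[start:start + batch_size])
    return total


def toy_nll(values):
    # An additive per-item term stands in for the real likelihood.
    return sum(0.5 * v * v for v in values)


items = [0.1 * i for i in range(25)]
full = batched_sum(toy_nll, items)
chunked = batched_sum(toy_nll, items, batch_size=4)
```

Because the likelihood is a sum of independent terms, batching changes memory use and timing but not the result (up to floating-point rounding).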

train_epoch(optimizer, batch_size=-1, batch_training=True, cats=slice(None))

Train a single epoch for the model.

Parameters:
  • optimizer (tf.keras.optimizers.Optimizer) – The Keras optimizer used to apply gradients.

  • batch_size (int, optional) – The size of the batch for negative log-likelihood (NLL) computation. Defaults to -1, which means no batching.

  • batch_training (bool, optional) – If True, updates parameters at each batch for faster training. Defaults to True.

  • cats (slice, optional) – A slice object to select specific categories of data. Defaults to slice(None).

Returns:

The total negative log-likelihood (NLL) for the epoch.

Return type:

tf.Tensor

minimize(optimizer, dnll_tol=0.1, max_epochs=1000, minimizer='Adam', init_rand=False, nepoch_print=10, init_resp=None, init_reso=None, device='CPU', batch_size=-1, batch_training=True, cats=slice(None))

Minimizes the negative log-likelihood using either a TensorFlow optimizer or a SciPy minimizer, as selected by minimizer.

Parameters:
  • optimizer (tf.keras.optimizers.Optimizer) – TensorFlow optimizer to apply gradients.

  • dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.

  • max_epochs (int) – Maximum number of epochs for optimization.

  • minimizer (str) – Optimization method, either ‘Adam’ or a SciPy minimizer (e.g., ‘TNC’).

  • init_rand (bool) – If True, initializes variables randomly.

  • nepoch_print (int) – Frequency of printing progress (every nepoch_print epochs).

  • device (str) – Device to use for computation (‘CPU’ or ‘GPU’).

  • batch_size (int) – Size of the batch for likelihood computation.

  • batch_training (bool) – If True, updates parameters at each batch for faster training.

Returns:

List of negative log-likelihood values for each epoch.

Return type:

List[float]
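
The stopping rule described by dnll_tol and max_epochs can be sketched with a toy loop. Plain gradient descent is used here in place of Adam or a SciPy method to keep the example dependency-free; `minimize_sketch`, `lr`, and the quadratic toy NLL are all hypothetical and only illustrate the convergence test on the change in -2logL.

```python
def minimize_sketch(nll, grad, x0, lr=0.005, dnll_tol=0.1, max_epochs=1000):
    """Gradient descent with a stopping rule on the change in -2 log L.

    Returns the final parameters and the NLL value after each epoch.
    """
    x = list(x0)
    history = [nll(x)]
    for _ in range(max_epochs):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
        history.append(nll(x))
        # Converged once -2 log L changes by less than dnll_tol.
        if abs(2.0 * (history[-1] - history[-2])) < dnll_tol:
            break
    return x, history


TARGET = (1.0, 0.02)  # toy minimum: response 1.0, resolution 0.02


def toy_nll(x):
    return 50.0 * sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


def toy_grad(x):
    return [100.0 * (xi - ti) for xi, ti in zip(x, TARGET)]


params, history = minimize_sketch(toy_nll, toy_grad, x0=(1.1, 0.05))
```

A looser dnll_tol stops earlier at the cost of a less precise minimum; max_epochs bounds the run time if the tolerance is never reached.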

minimize_tf(optimizer, dnll_tol=0.1, max_epochs=1000, init_rand=False, nepoch_print=10, init_resp=None, init_reso=None, device='CPU', batch_size=-1, batch_training=True, cats=slice(None))

Minimizes the negative log-likelihood using a TensorFlow optimizer.

Parameters:
  • optimizer (tf.keras.optimizers.Optimizer) – TensorFlow optimizer to apply gradients.

  • dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.

  • max_epochs (int) – Maximum number of epochs for optimization.

  • init_rand (bool) – If True, initializes variables randomly.

  • nepoch_print (int) – Frequency of printing progress (every nepoch_print epochs).

  • device (str) – Device to use for computation (‘CPU’ or ‘GPU’).

  • batch_size (int) – Size of the batch for likelihood computation.

  • batch_training (bool) – If True, updates parameters at each batch for faster training.

  • cats (slice) – A slice object to select specific categories of data.

Returns:

Array of negative log-likelihood values for each epoch.

Return type:

np.ndarray

minimize_sp(dnll_tol=0.1, minimizer='TNC', init_rand=False, init_resp=None, init_reso=None, device='CPU', batch_size=-1, cats=slice(None))

Minimizes the negative log-likelihood using a SciPy optimizer.

Parameters:
  • dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.

  • minimizer (str) – SciPy minimizer method (e.g., ‘TNC’, ‘L-BFGS-B’).

  • init_rand (bool) – If True, initializes variables randomly.

  • init_resp (np.ndarray, optional) – Initial values for the response parameters. Defaults to None.

  • init_reso (np.ndarray, optional) – Initial values for the resolution parameters. Defaults to None.

  • device (str) – Device to use for computation (‘CPU’ or ‘GPU’).

  • batch_size (int) – Size of the batch for likelihood computation.

  • cats (slice) – A slice object to select specific categories of data.

Returns:

Array containing the final negative log-likelihood value.

Return type:

np.ndarray

get_index(e1=None, e2=None, c=None)

Get indices based on electron or pair categories.

Parameters:
  • e1 (list, optional) – List of electron indices to match.

  • e2 (list, optional) – List of electron indices to match. If both e1 and e2 are provided, pairs of indices will be matched.

  • c (list[tuple], optional) – List of electron index pairs to match.

Returns:

Unique indices matching the categories. Returns all indices if no categories.

Return type:

pd.Index

Notes

  • If both e1 and e2 are provided, all combinations of pairs, including reversed, are considered.

  • Assumes self.regional_electron_indices is a pandas DataFrame.
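
The matching rules in the notes above can be sketched without pandas as follows. A list of `(cat_ele1, cat_ele2)` tuples stands in for regional_electron_indices, and `match_regions` is a hypothetical name. Two behaviours are assumptions of this sketch and are not stated in the documentation: a lone e1 or e2 matches either electron of the pair, and pairs given through c are matched in the order written.

```python
import itertools


def match_regions(regions, e1=None, e2=None, c=None):
    """Return the region indices whose electron-category pair matches."""
    if e1 is None and e2 is None and c is None:
        return list(range(len(regions)))  # no selection: all regions
    pairs, singles = set(), set()
    if e1 is not None and e2 is not None:
        for a, b in itertools.product(e1, e2):
            pairs.update({(a, b), (b, a)})  # reversed pairs count too
    else:
        # Assumption: a single list matches either electron of the pair.
        singles = set(e1 or e2 or [])
    # Assumption: explicit pairs in c are matched as written.
    pairs.update(tuple(p) for p in (c or []))
    return [
        idx for idx, (a, b) in enumerate(regions)
        if (a, b) in pairs or a in singles or b in singles
    ]


regions = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2)]
both = match_regions(regions, e1=[0], e2=[1])
explicit = match_regions(regions, c=[(1, 2)])
everything = match_regions(regions)
```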

plot_region_fits(cols=5, show_mc=False, n_plots=None, e1=None, e2=None, cats=None, rll_sll=None)

Plot the data and fit for each region.

Parameters:
  • cols (int) – Number of columns in the plot.

  • show_mc (bool) – If True, also show the uncorrected Monte Carlo.

  • n_plots (int) – Number of plots to show.

  • e1 (list) – List of electron indices to match.

  • e2 (list) – List of electron indices to match. If both e1 and e2 are provided, pairs of indices will be matched.

  • cats (list[tuple]) – List of electron index pairs to match.

  • rll_sll (tuple) – Tuple with the rll and sll for each gaussian.

Returns:

Tuple containing the figure and axes objects.

Return type:

tuple

get_hessian(afunc, numerical=True, **kwargs)

Compute the Hessian of a function, either numerically or analytically.

Parameters:
  • afunc (function) – function to get hessian of

  • numerical (bool, optional) – type of computation. Defaults to True.

Returns:

hessian matrix

Return type:

np.array
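
As an illustration of the numerical option, a Hessian can be estimated with central finite differences. This is a generic sketch, not necessarily the scheme used by the fitter; `numerical_hessian`, `eps`, and the toy function are hypothetical.

```python
def numerical_hessian(func, x, eps=1e-4):
    """Central finite-difference Hessian of func at point x (list of floats)."""
    n = len(x)

    def shifted(i, di, j, dj):
        # Evaluate func with coordinates i and j displaced by di and dj.
        y = list(x)
        y[i] += di
        y[j] += dj
        return func(y)

    hess = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            val = (
                shifted(i, eps, j, eps) - shifted(i, eps, j, -eps)
                - shifted(i, -eps, j, eps) + shifted(i, -eps, j, -eps)
            ) / (4.0 * eps * eps)
            hess[i][j] = hess[j][i] = val  # the Hessian is symmetric
    return hess


def toy_nll(p):
    # Quadratic form with known Hessian [[4, 1], [1, 6]].
    return 2.0 * p[0] ** 2 + p[0] * p[1] + 3.0 * p[1] ** 2


hess = numerical_hessian(toy_nll, [0.3, -0.2])
```

The number of function calls grows quadratically with the number of parameters, which is one reason an analytic (automatic-differentiation) Hessian can be preferable for large fits.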

covariance(numerical=True, force=False, **kwargs)

Computes and stores the Hessian, covariance, and correlation matrices.

Parameters:
  • numerical (bool, optional) – Whether to compute the Hessian numerically. Defaults to True.

  • force (bool, optional) – If True, recompute all matrices even if they already exist. Defaults to False.

  • **kwargs – Additional parameters for the nll_batch function (e.g., batch_size).

Returns:

None
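
The relation between the three matrices can be sketched on a 2x2 toy case. The sketch assumes the Hessian is that of the NLL itself (not of -2logL), so that the covariance is simply its inverse; the helper names are hypothetical, and a real fit with many parameters would use a linear-algebra library for the inversion.

```python
import math


def invert_2x2(h):
    """Closed-form inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def errors_and_correlation(cov):
    """Parameter errors (sqrt of the diagonal) and the correlation matrix."""
    n = len(cov)
    err = [math.sqrt(cov[i][i]) for i in range(n)]
    corr = [[cov[i][j] / (err[i] * err[j]) for j in range(n)] for i in range(n)]
    return err, corr


hess = [[4.0, 1.0], [1.0, 6.0]]  # Hessian of a toy NLL at its minimum
cov = invert_2x2(hess)
err, corr = errors_and_correlation(cov)
```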

calc_err_mc() → None

Compute the uncertainty due to limited MC statistics.

Returns:

None. The errors are stored in dedicated attributes of the fitter.

Return type:

None