ijazz.RegionalFitter
Classes
RegionalFitter – Fit scale and smearing parameters in category regions.
Module Contents
- class ijazz.RegionalFitter.RegionalFitter(dt: pandas.DataFrame, mc: pandas.DataFrame, n_par: int = -1, name_cat='cat', name_mll='mee', name_weights: str = None, min_nevt_region_dt=10, min_nevt_region_mc=100, win_z_dt=(70, 110), win_z_mc=(50, 130), bin_width_dt='Q', bin_width_mc=0.1, double_gaussian=False, single_parameters=[], error_parameters=['resp', 'reso'], fast_hist=False, is_mmg=False, verbose=True)
Fit scale and smearing parameters in category regions.
- rll_sll(cats=slice(None))
Compute the dilepton rll and sll for each Gaussian. With a double-Gaussian model, each lepton contributes two Gaussians, so the dilepton system has 4.
- Parameters:
cats (slice, optional) – categories where to compute the rll and sll. Defaults to slice(None).
- Returns:
tuple of rll and sll for each gaussian
- Return type:
tuple
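The count of four Gaussians in the double-Gaussian case can be illustrated with a minimal sketch. The combination rule used here (geometric mean of the per-lepton responses, half the quadrature sum of the relative resolutions) is an assumption based on first-order error propagation for an invariant mass, not something taken from the ijazz source, and `dilepton_rll_sll` is a hypothetical name.

```python
import itertools
import math


def dilepton_rll_sll(lep1, lep2):
    """Combine per-lepton (resp, reso) Gaussians into dilepton Gaussians.

    lep1, lep2: lists of (resp, reso) tuples, one tuple per Gaussian component.
    Assumed rule (not from the ijazz source): m_ll scales as sqrt(E1 * E2), so
    rll = sqrt(r1 * r2) and sll = 0.5 * sqrt(s1**2 + s2**2).
    """
    rll, sll = [], []
    for (r1, s1), (r2, s2) in itertools.product(lep1, lep2):
        rll.append(math.sqrt(r1 * r2))
        sll.append(0.5 * math.hypot(s1, s2))
    return tuple(rll), tuple(sll)


# Two Gaussian components per lepton give 2 x 2 = 4 dilepton Gaussians.
lep1 = [(1.00, 0.02), (0.98, 0.05)]
lep2 = [(1.01, 0.03), (0.97, 0.06)]
rll, sll = dilepton_rll_sll(lep1, lep2)
print(len(rll), len(sll))
```

With a single Gaussian per lepton, the same function returns one rll and one sll value.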
- pi(cats: slice = slice(None), rll_sll: tuple = None)
Compute the probability of being in bin i (after smearing) for each category.
- Parameters:
cats (slice, optional) – categories where to compute the pis. Defaults to slice(None).
rll_sll (tuple, optional) – tuple with the rll and sll for each gaussian. Defaults to None.
- Returns:
2D array of probabilities per category
- Return type:
np.array
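A minimal sketch of how such per-bin probabilities could be obtained for one category and one Gaussian is shown here. It assumes that each MC mass bin is shifted by rll and smeared by a Gaussian of relative width sll, then integrated over the data bins; the names `gauss_cdf` and `bin_probabilities` and the toy inputs are hypothetical and do not come from ijazz.

```python
import math

import numpy as np


def gauss_cdf(x, mu, sigma):
    """Gaussian cumulative distribution, evaluated element-wise on arrays."""
    z = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))


def bin_probabilities(mc_centers, mc_weights, edges, rll, sll):
    """Probability of landing in each data bin after scaling and smearing."""
    mu = rll * mc_centers       # shifted mass of each MC bin j
    sigma = sll * mc_centers    # absolute Gaussian width of each MC bin j
    # cdf[e, j]: probability that MC bin j ends up below data edge e
    cdf = gauss_cdf(edges[:, None], mu[None, :], sigma[None, :])
    p_ij = np.diff(cdf, axis=0)  # probability of MC bin j landing in data bin i
    pi = p_ij @ mc_weights
    return pi / pi.sum()         # normalise inside the fit window


# Toy MC histogram peaked near 91 GeV, data bins of 1 GeV between 70 and 110.
mc_centers = np.linspace(80.0, 100.0, 41)
mc_weights = np.exp(-0.5 * ((mc_centers - 91.0) / 2.5) ** 2)
edges = np.linspace(70.0, 110.0, 41)
pi = bin_probabilities(mc_centers, mc_weights, edges, rll=1.0, sll=0.02)
```

For a double-Gaussian model, one would evaluate this once per Gaussian and combine the results with the appropriate mixture fractions.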
- nll_cat(cats: slice = slice(None))
Compute the negative log likelihood (multinomial law)
- Parameters:
cats (slice, optional) – categories where to compute the nll. Defaults to slice(None).
- Returns:
nll
- Return type:
tf.float64
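As an illustration of a multinomial negative log-likelihood, the following NumPy sketch evaluates it for observed counts and predicted bin probabilities. The function name `multinomial_nll` and the toy counts are hypothetical, and the constant multinomial coefficient is dropped on the assumption that only parameter-dependent terms matter for the fit.

```python
import numpy as np


def multinomial_nll(n_obs, p):
    """-sum_i n_i * log(p_i), computed per category (last axis = mass bins)."""
    n_obs = np.asarray(n_obs, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), 1e-300, None)  # guard against log(0)
    return -np.sum(n_obs * np.log(p), axis=-1)


# Two categories with three mass bins each.
n_obs = np.array([[10, 30, 10], [5, 20, 5]])
p_good = n_obs / n_obs.sum(axis=1, keepdims=True)  # probabilities matching the data
p_flat = np.full_like(p_good, 1.0 / 3.0)           # a deliberately poor model

nll_good = multinomial_nll(n_obs, p_good).sum()
nll_flat = multinomial_nll(n_obs, p_flat).sum()
print(nll_good < nll_flat)
```

The probabilities that match the observed fractions give the smaller value, which is the behaviour a minimizer exploits.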
- dnll_dmjc(cats)
Compute the gradient of the negative log likelihood with respect to the MC histogram.
- Parameters:
cats (slice) – categories where to compute the gradient.
- Returns:
gradient of the nll with respect to the MC histogram.
- Return type:
tf.float64
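The form such a gradient can take is sketched here for a single category. The sketch assumes the predicted probabilities are p_i = q_i / sum(q) with q = P @ m, where m is the MC histogram and P a fixed smearing matrix; this model, and the names `nll` and `dnll_dm`, are illustrative assumptions rather than the ijazz implementation.

```python
import numpy as np


def nll(m, P, n):
    """Multinomial NLL with probabilities built from the MC histogram m."""
    q = P @ m
    return -np.sum(n * np.log(q / q.sum()))


def dnll_dm(m, P, n):
    """Analytic gradient of nll with respect to each MC bin content m_j."""
    q = P @ m
    # d/dm_j of [-sum_i n_i log q_i + N log sum_i q_i]
    return -(n / q) @ P + n.sum() * P.sum(axis=0) / q.sum()


rng = np.random.default_rng(0)
P = rng.random((4, 6))                           # toy smearing matrix (4 data bins, 6 MC bins)
m = rng.random(6) + 1.0                          # toy MC histogram
n = rng.integers(1, 50, size=4).astype(float)    # toy data counts
grad = dnll_dm(m, P, n)
```

Because this NLL is unchanged when m is rescaled, the gradient is orthogonal to m (`grad @ m` is zero up to rounding), which makes a convenient sanity check.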
- batcher(afunc, batch_size=-1, shuffle=False, cats=slice(None))
Compute the likelihood in batches, then sum the results.
- Parameters:
batch_size (int) – size of the batch.
shuffle (bool) – shuffle the events before batching (serves no purpose here but is kept for timing tests).
- Returns:
the summed likelihood.
- Return type:
tf.float64
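The batch-then-sum pattern can be sketched as follows. Here the batches are slices over categories, which is an assumption about how the work is split; `batched_sum` and `partial_nll` are hypothetical names, and a non-positive batch size is taken to mean a single pass, mirroring the -1 default.

```python
import numpy as np


def batched_sum(afunc, n_cats, batch_size=-1):
    """Evaluate afunc on consecutive slices of categories and sum the results."""
    if batch_size <= 0:
        return afunc(slice(None))  # no batching: one call over everything
    total = 0.0
    for start in range(0, n_cats, batch_size):
        stop = min(start + batch_size, n_cats)
        total += afunc(slice(start, stop))
    return total


# Toy per-category NLL values standing in for the real likelihood terms.
nll_per_cat = np.array([12.5, 8.0, 3.25, 40.0, 1.5, 6.0, 9.75])


def partial_nll(cats):
    return nll_per_cat[cats].sum()


full = batched_sum(partial_nll, len(nll_per_cat))
batched = batched_sum(partial_nll, len(nll_per_cat), batch_size=3)
```

Since the likelihood terms are additive, the batched and unbatched totals agree; batching only limits how much is evaluated at once.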
- train_epoch(optimizer, batch_size=-1, batch_training=True, cats=slice(None))
Train a single epoch for the model.
- Parameters:
optimizer (tf.keras.optimizers.Optimizer) – The Keras optimizer used to apply gradients.
batch_size (int, optional) – The size of the batch for negative log-likelihood (NLL) computation. Defaults to -1, which means no batching.
batch_training (bool, optional) – If True, updates parameters at each batch for faster training. Defaults to True.
cats (slice, optional) – A slice object to select specific categories of data. Defaults to slice(None).
- Returns:
The total negative log-likelihood (NLL) for the epoch.
- Return type:
tf.Tensor
- minimize(optimizer, dnll_tol=0.1, max_epochs=1000, minimizer='Adam', init_rand=False, nepoch_print=10, init_resp=None, init_reso=None, device='CPU', batch_size=-1, batch_training=True, cats=slice(None))
Minimizes the negative log-likelihood using either a TensorFlow optimizer or a SciPy minimizer, as selected by minimizer.
- Parameters:
optimizer (tf.keras.optimizers.Optimizer) – TensorFlow optimizer to apply gradients.
dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.
max_epochs (int) – Maximum number of epochs for optimization.
minimizer (str) – Optimization method, either ‘Adam’ or a SciPy minimizer (e.g., ‘TNC’).
init_rand (bool) – If True, initializes variables randomly.
nepoch_print (int) – Frequency of printing progress (every nepoch_print epochs).
device (str) – Device to use for computation (‘CPU’ or ‘GPU’).
batch_size (int) – Size of the batch for likelihood computation.
batch_training (bool) – If True, updates parameters at each batch for faster training.
- Returns:
List of negative log-likelihood values for each epoch.
- Return type:
List[float]
- minimize_tf(optimizer, dnll_tol=0.1, max_epochs=1000, init_rand=False, nepoch_print=10, init_resp=None, init_reso=None, device='CPU', batch_size=-1, batch_training=True, cats=slice(None))
Minimizes the negative log-likelihood using a TensorFlow optimizer.
- Parameters:
optimizer (tf.keras.optimizers.Optimizer) – TensorFlow optimizer to apply gradients.
dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.
max_epochs (int) – Maximum number of epochs for optimization.
init_rand (bool) – If True, initializes variables randomly.
nepoch_print (int) – Frequency of printing progress (every nepoch_print epochs).
device (str) – Device to use for computation (‘CPU’ or ‘GPU’).
batch_size (int) – Size of the batch for likelihood computation.
batch_training (bool) – If True, updates parameters at each batch for faster training.
cats (slice) – A slice object to select specific categories of data.
- Returns:
Array of negative log-likelihood values for each epoch.
- Return type:
np.ndarray
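The role of dnll_tol and max_epochs can be illustrated with a small epoch loop. In this sketch plain gradient descent stands in for the Keras optimizer, and the stopping rule (change in -2logL between consecutive epochs below dnll_tol) is an assumed reading of the parameter description; `minimize_sketch`, `toy_nll_and_grad` and `target` are hypothetical.

```python
import numpy as np


def minimize_sketch(nll_and_grad, x0, lr, dnll_tol=0.1, max_epochs=1000):
    """Run epochs until -2logL changes by less than dnll_tol or max_epochs is hit."""
    x = np.asarray(x0, dtype=float).copy()
    history = []
    for epoch in range(max_epochs):
        nll, grad = nll_and_grad(x)
        history.append(nll)
        # Convergence test on the change in -2logL since the previous epoch.
        if epoch > 0 and abs(2.0 * (history[-2] - history[-1])) < dnll_tol:
            break
        x -= lr * grad  # plain gradient step, standing in for optimizer.apply_gradients
    return x, np.array(history)


target = np.array([1.0, 0.02])  # toy "true" response and resolution


def toy_nll_and_grad(x):
    """Quadratic toy NLL and its gradient, minimised at target."""
    diff = x - target
    return 50.0 * np.sum(diff ** 2), 100.0 * diff


x_fit, history = minimize_sketch(toy_nll_and_grad, [0.9, 0.05], lr=0.005, dnll_tol=1e-6)
```

The returned array holds one NLL value per epoch, so its length shows how many epochs were needed before the tolerance was met.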
- minimize_sp(dnll_tol=0.1, minimizer='TNC', init_rand=False, init_resp=None, init_reso=None, device='CPU', batch_size=-1, cats=slice(None))
Minimizes the negative log-likelihood using a SciPy optimizer.
- Parameters:
dnll_tol (float) – Tolerance for the change in -2logL to determine convergence.
minimizer (str) – SciPy minimizer method (e.g., ‘TNC’, ‘L-BFGS-B’).
init_rand (bool) – If True, initializes variables randomly.
init_resp (np.ndarray, optional) – Initial values for the response parameters. Defaults to None.
init_reso (np.ndarray, optional) – Initial values for the resolution parameters. Defaults to None.
device (str) – Device to use for computation (‘CPU’ or ‘GPU’).
batch_size (int) – Size of the batch for likelihood computation.
cats (slice) – A slice object to select specific categories of data.
- Returns:
Array containing the final negative log-likelihood value.
- Return type:
np.ndarray
- get_index(e1=None, e2=None, c=None)
Get indices based on electron or pair categories.
- Parameters:
e1 (list, optional) – List of electron indices to match.
e2 (list, optional) – List of electron indices to match. If both e1 and e2 are provided, pairs of indices will be matched.
c (list[tuple], optional) – List of electron index pairs to match.
- Returns:
Unique indices matching the categories. Returns all indices if no categories are given.
- Return type:
pd.Index
Notes
If both e1 and e2 are provided, all combinations of pairs, including reversed, are considered.
Assumes self.regional_electron_indices is a pandas DataFrame.
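The pair-matching rule described in the notes can be sketched with plain Python. A list of (electron index 1, electron index 2) tuples stands in for the DataFrame, `match_regions` is a hypothetical name, and two behaviours are assumptions: pairs passed through c are matched exactly as given, and the case where only one of e1 or e2 is supplied is left out.

```python
import itertools


def match_regions(region_pairs, e1=None, e2=None, c=None):
    """Return positions of regions whose electron-index pair is requested."""
    wanted = set()
    if e1 is not None and e2 is not None:
        # All combinations of e1 and e2, in both orders.
        for a, b in itertools.product(e1, e2):
            wanted.update({(a, b), (b, a)})
    if c is not None:
        wanted.update(tuple(pair) for pair in c)
    if not wanted:
        return list(range(len(region_pairs)))  # no selection: keep everything
    return [i for i, pair in enumerate(region_pairs) if tuple(pair) in wanted]


region_pairs = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 2)]
selected = match_regions(region_pairs, e1=[0], e2=[1])
print(selected)  # positions of (0, 1) and (1, 0)
```

Collecting the requested pairs in a set removes duplicates, so each region position appears at most once in the result.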
- plot_region_fits(cols=5, show_mc=False, n_plots=None, e1=None, e2=None, cats=None, rll_sll=None)
Plot the data and fit for each region.
- Parameters:
cols (int) – Number of columns in the plot.
show_mc (bool) – If True, also show the uncorrected Monte Carlo.
n_plots (int) – Number of plots to show.
e1 (list) – List of electron indices to match.
e2 (list) – List of electron indices to match. If both e1 and e2 are provided, pairs of indices will be matched.
cats (list[tuple]) – List of electron index pairs to match.
rll_sll (tuple) – Tuple with the rll and sll for each gaussian.
- Returns:
Tuple containing the figure and axes objects.
- Return type:
tuple
- get_hessian(afunc, numerical=True, **kwargs)
Compute the Hessian of a function, either numerically or analytically.
- Parameters:
afunc (function) – function to get hessian of
numerical (bool, optional) – type of computation. Defaults to True.
- Returns:
hessian matrix
- Return type:
np.array
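A numerical Hessian can be obtained with central finite differences, as in the sketch below. The step size, the stencil and the name `numerical_hessian` are illustrative choices and may differ from what get_hessian actually does.

```python
import numpy as np


def numerical_hessian(afunc, x, eps=1e-4):
    """Central finite-difference estimate of the Hessian of afunc at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            # Four-point stencil for d2f / dx_i dx_j; fills both triangles.
            hess[i, j] = hess[j, i] = (
                afunc(x + ei + ej) - afunc(x + ei - ej)
                - afunc(x - ei + ej) + afunc(x - ei - ej)
            ) / (4.0 * eps ** 2)
    return hess


A = np.array([[4.0, 1.0], [1.0, 3.0]])


def quadratic(x):
    """Toy function whose exact Hessian is A."""
    return 0.5 * x @ A @ x


hess = numerical_hessian(quadratic, [0.3, -0.2])
```

For a quadratic function the estimate reproduces the exact Hessian up to rounding, which makes it a convenient check of the stencil.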
- covariance(numerical=True, force=False, **kwargs)
Computes and stores the Hessian, covariance, and correlation matrices.
- Parameters:
numerical (bool, optional) – Whether to compute the Hessian numerically. Defaults to True.
force (bool, optional) – If True, recompute all matrices even if they already exist. Defaults to False.
**kwargs – Additional parameters for the nll_batch function (e.g., batch_size).
- Returns:
None
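The usual relation between the three matrices can be sketched as follows, assuming the Hessian is that of the negative log-likelihood evaluated at its minimum, so that the covariance is its inverse. The function name `covariance_from_hessian` and the toy Hessian are hypothetical.

```python
import numpy as np


def covariance_from_hessian(hessian):
    """Return covariance, correlation and 1-sigma errors from an NLL Hessian."""
    cov = np.linalg.inv(hessian)         # covariance = inverse Hessian of the NLL
    sigma = np.sqrt(np.diag(cov))        # parameter uncertainties
    corr = cov / np.outer(sigma, sigma)  # normalise to unit diagonal
    return cov, corr, sigma


# Toy 2-parameter Hessian, e.g. one response and one resolution parameter.
hessian = np.array([[400.0, 50.0], [50.0, 100.0]])
cov, corr, sigma = covariance_from_hessian(hessian)
```

If the Hessian were instead taken of -2logL, the covariance would be twice its inverse, so it is worth checking which convention a given function uses.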