# Categorical

class probnum.randvars.Categorical(probabilities, support=None)

Categorical random variable.

Parameters
• probabilities (ndarray) – Probabilities of the events.

• support (Optional[ndarray]) – Support of the categorical distribution. Default is None, in which case the support is chosen as $$(0, ..., K-1)$$, where $$K$$ is the number of elements in probabilities.
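The default support described above can be sketched with NumPy alone. This is an illustration of the stated convention, not probnum's implementation; the helper name `default_support` is hypothetical.

```python
import numpy as np


def default_support(probabilities):
    """Return the integer support (0, ..., K-1) for K event probabilities.

    Hypothetical helper: mirrors the documented default, not probnum's code.
    """
    probabilities = np.asarray(probabilities)
    return np.arange(probabilities.shape[0])


probabilities = np.array([0.2, 0.5, 0.3])
support = default_support(probabilities)  # one integer label per event
```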

Attributes Summary

• T – Transpose the random variable.
• cov – Covariance $$\operatorname{Cov}(X) = \mathbb{E}((X-\mathbb{E}(X))(X-\mathbb{E}(X))^\top)$$ of the random variable.
• dtype – Data type of (elements of) a realization of this random variable.
• entropy – Information-theoretic entropy $$H(X)$$ of the random variable.
• mean – Mean $$\mathbb{E}(X)$$ of the random variable.
• median – Median of the random variable.
• median_dtype – The dtype of the median.
• mode – Mode of the random variable.
• moment_dtype – The dtype of any (function of a) moment of the random variable, e.g. its mean, cov, var, or std.
• ndim – Number of dimensions of realizations of the random variable.
• parameters – Parameters of the associated probability distribution.
• probabilities – Event probabilities of the categorical distribution.
• shape – Shape of realizations of the random variable.
• size – Size of realizations of the random variable, defined as the product over all components of shape.
• std – Standard deviation of the random variable.
• support – Support of the categorical distribution.
• var – Variance $$\operatorname{Var}(X) = \mathbb{E}((X-\mathbb{E}(X))^2)$$ of the random variable.

Methods Summary

• cdf(x) – Cumulative distribution function.
• in_support(x) – Check whether the random variable takes value x with non-zero probability, i.e. if x is in the support of its distribution.
• infer_median_dtype(value_dtype) – Infer the dtype of the median.
• infer_moment_dtype(value_dtype) – Infer the dtype of any moment.
• logcdf(x) – Log-cumulative distribution function.
• logpmf(x) – Natural logarithm of the probability mass function.
• pmf(x) – Probability mass function.
• quantile(p) – Quantile function.
• resample(rng) – Resample the support of the categorical random variable.
• reshape(newshape) – Give a new shape to a random variable.
• sample(rng[, size]) – Draw realizations from a random variable.
• transpose(*axes) – Transpose the random variable.

Attributes Documentation

T

Transpose the random variable.

Parameters

axes (int) – See documentation of numpy.ndarray.transpose().

Return type

RandomVariable

cov

Covariance $$\operatorname{Cov}(X) = \mathbb{E}((X-\mathbb{E}(X))(X-\mathbb{E}(X))^\top)$$ of the random variable.

To learn about the dtype of the covariance, see moment_dtype.

dtype

Data type of (elements of) a realization of this random variable.

Return type

dtype

entropy

Information-theoretic entropy $$H(X)$$ of the random variable.
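For a categorical distribution, the Shannon entropy is $$H(X) = -\sum_k p_k \log p_k$$. The sketch below evaluates that formula directly. It assumes the natural logarithm (entropy in nats), which this page does not state, and the function name `categorical_entropy` is hypothetical.

```python
import math


def categorical_entropy(probabilities):
    """Shannon entropy in nats (assumption: natural logarithm).

    Events with zero probability contribute nothing to the sum.
    """
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)


uniform_entropy = categorical_entropy([0.25, 0.25, 0.25, 0.25])
certain_entropy = categorical_entropy([1.0, 0.0])
```

A uniform distribution over four events attains the maximum value $$\log 4$$, while a distribution that puts all mass on one event has zero entropy.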

mean

Mean $$\mathbb{E}(X)$$ of the random variable.

To learn about the dtype of the mean, see moment_dtype.

median

Median of the random variable.

To learn about the dtype of the median, see median_dtype.

median_dtype

The dtype of the median.

It will be set to the dtype arising from the multiplication of values with dtypes dtype and numpy.float_. This is motivated by the fact that, even for discrete random variables, e.g. integer-valued random variables, the median might lie between two values, in which case these values are averaged. For example, a uniform random variable on $$\{ 1, 2, 3, 4 \}$$ will have a median of $$2.5$$.

Return type

dtype
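The dtype rule above can be approximated with NumPy's type promotion. This is a sketch under assumptions: `promoted_float_dtype` is a hypothetical helper, `numpy.promote_types` stands in for "the dtype arising from the multiplication", and `numpy.float64` is used because the `numpy.float_` alias was removed in NumPy 2.0.

```python
import numpy as np


def promoted_float_dtype(value_dtype):
    """Dtype obtained when promoting `value_dtype` with a 64-bit float.

    Hypothetical helper illustrating the rule, not probnum's implementation.
    """
    return np.promote_types(np.dtype(value_dtype), np.float64)


values = np.array([1, 2, 3, 4])  # integer-valued support
median = np.median(values)       # averages the two middle values
median_dtype = promoted_float_dtype(values.dtype)
```

Here the median of the integer values is the non-integer 2.5, which is why a floating-point dtype is needed to represent it.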

mode

Mode of the random variable.

moment_dtype

The dtype of any (function of a) moment of the random variable, e.g. its mean, cov, var, or std.

It will be set to the dtype arising from the multiplication of values with dtypes dtype and numpy.float_. This is motivated by the mathematical definition of a moment as a sum or an integral over products of probabilities and values of the random variable, which are represented using the dtypes numpy.float_ and dtype, respectively.

Return type

dtype
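To make the motivation concrete, the following sketch computes a mean as a sum over products of probabilities and values and inspects the resulting dtype. It uses plain NumPy arrays rather than a probnum random variable, and `numpy.float64` in place of the `numpy.float_` alias (removed in NumPy 2.0).

```python
import numpy as np

values = np.array([1, 2, 3, 4], dtype=np.int64)  # dtype of realizations
probabilities = np.full(4, 0.25)                 # float64 probabilities

# A first moment: sum over products of probabilities and values.
mean = np.sum(probabilities * values)
moment_dtype = mean.dtype
```

Although the values are integers, each product involves a floating-point probability, so the moment is a floating-point number.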

ndim

Number of dimensions of realizations of the random variable.

parameters

Parameters of the associated probability distribution.

The parameters of the probability distribution of the random variable (e.g. mean, variance, scale, or rate), stored in a dict.

Return type

Dict[str, Any]

probabilities

Event probabilities of the categorical distribution.

Return type

ndarray

shape

Shape of realizations of the random variable.

Return type

Tuple[int, …]

size

Size of realizations of the random variable, defined as the product over all components of shape.

std

Standard deviation of the random variable.

To learn about the dtype of the standard deviation, see moment_dtype.

support

Support of the categorical distribution.

Return type

ndarray

var

Variance $$\operatorname{Var}(X) = \mathbb{E}((X-\mathbb{E}(X))^2)$$ of the random variable.

To learn about the dtype of the variance, see moment_dtype.
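For a categorical distribution with a numeric, scalar-valued support, the variance formula reduces to a weighted sum. The sketch below evaluates it directly; `mean_and_variance` is a hypothetical helper, and whether Categorical exposes var for every kind of support is not stated on this page.

```python
def mean_and_variance(support, probabilities):
    """Mean and variance of a discrete distribution with numeric support."""
    pairs = list(zip(support, probabilities))
    mean = sum(p * x for x, p in pairs)
    variance = sum(p * (x - mean) ** 2 for x, p in pairs)
    return mean, variance


mean, variance = mean_and_variance([1, 2, 3, 4], [0.25, 0.25, 0.25, 0.25])
```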

Methods Documentation

cdf(x)

Cumulative distribution function.

Parameters

x (~ValueType) – Evaluation points of the cumulative distribution function. The shape of this argument should be (..., S1, ..., SN), where (S1, ..., SN) is the shape of the random variable. The cdf evaluation will be broadcast over all additional dimensions.

Return type

float64
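For a discrete distribution with an orderable support, the cumulative distribution function is the total probability of all support elements not exceeding x. The following is a minimal sketch of that definition with a hypothetical function name; it does not call probnum, and this page does not say for which supports Categorical implements cdf.

```python
def categorical_cdf(x, support, probabilities):
    """P(X <= x) for a categorical distribution with orderable support."""
    return sum(p for s, p in zip(support, probabilities) if s <= x)


support = [0, 1, 2]
probabilities = [0.2, 0.5, 0.3]
cdf_at_one = categorical_cdf(1, support, probabilities)
```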

in_support(x)

Check whether the random variable takes value x with non-zero probability, i.e. if x is in the support of its distribution.

Parameters

x (~ValueType) – Input value.

Return type

bool

static infer_median_dtype(value_dtype)

Infer the dtype of the median.

Set the dtype to the dtype arising from the multiplication of values with dtypes value_dtype and numpy.float_. This is motivated by the fact that, even for discrete random variables, e.g. integer-valued random variables, the median might lie between two values, in which case these values are averaged. For example, a uniform random variable on $$\{ 1, 2, 3, 4 \}$$ will have a median of $$2.5$$.

Parameters

value_dtype (Union[dtype, str]) – Dtype of a value.

Return type

dtype

static infer_moment_dtype(value_dtype)

Infer the dtype of any moment.

Infers the dtype of any (function of a) moment of the random variable, e.g. its mean, cov, var, or std. Returns the dtype arising from the multiplication of values with dtypes value_dtype and numpy.float_. This is motivated by the mathematical definition of a moment as a sum or an integral over products of probabilities and values of the random variable, which are represented using the dtypes numpy.float_ and value_dtype, respectively.

Parameters

value_dtype (Union[dtype, str]) – Dtype of a value.

Return type

dtype

logcdf(x)

Log-cumulative distribution function.

Parameters

x (~ValueType) – Evaluation points of the cumulative distribution function. The shape of this argument should be (..., S1, ..., SN), where (S1, ..., SN) is the shape of the random variable. The logcdf evaluation will be broadcast over all additional dimensions.

Return type

float64

logpmf(x)

Natural logarithm of the probability mass function.

Parameters

x (~ValueType) – Evaluation points of the log-probability mass function. The shape of this argument should be (..., S1, ..., SN), where (S1, ..., SN) is the shape of the random variable. The logpmf evaluation will be broadcast over all additional dimensions.

Return type

float64

pmf(x)

Probability mass function.

Computes the probability of the random variable being equal to the given value. For a random variable $$X$$ it is defined as $$p_X(x) = P(X = x)$$ for a probability measure $$P$$.

Probability mass functions are the discrete analogue of probability density functions in the sense that they are the Radon-Nikodym derivative of the pushforward measure $$P \circ X^{-1}$$ defined by the random variable with respect to the counting measure.

Parameters

x (~ValueType) – Evaluation points of the probability mass function. The shape of this argument should be (..., S1, ..., SN), where (S1, ..., SN) is the shape of the random variable. The pmf evaluation will be broadcast over all additional dimensions.

Return type

float64
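The definition $$p_X(x) = P(X = x)$$ can be sketched for a categorical distribution as a lookup over the support. The function names are hypothetical and the code does not use probnum. Summing over all matches (rather than returning the first) is a design choice made here so that a support with repeated elements is handled sensibly.

```python
import math


def categorical_pmf(x, support, probabilities):
    """P(X = x): total probability of support elements equal to x, else 0."""
    return sum(p for s, p in zip(support, probabilities) if s == x)


def categorical_logpmf(x, support, probabilities):
    """Natural logarithm of the pmf; -inf outside the support."""
    p = categorical_pmf(x, support, probabilities)
    return math.log(p) if p > 0.0 else -math.inf


support = ["a", "b", "c"]
probabilities = [0.2, 0.5, 0.3]
pmf_b = categorical_pmf("b", support, probabilities)
logpmf_z = categorical_logpmf("z", support, probabilities)
```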

quantile(p)

Quantile function.

The quantile function $$Q \colon [0, 1] \to \mathbb{R}$$ of a random variable $$X$$ is defined as $$Q(p) = \inf\{ x \in \mathbb{R} \colon p \le F_X(x) \}$$, where $$F_X \colon \mathbb{R} \to [0, 1]$$ is the cdf() of the random variable.

From the definition it follows that the quantile function always returns values of the same dtype as the random variable. For instance, for a discrete distribution over the integers, the returned quantiles will also be integers. This means that, in general, $$Q(0.5)$$ is not equal to the median as it is defined in this class. See https://en.wikipedia.org/wiki/Quantile_function for more details and examples.

Return type

~ValueType
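The difference between $$Q(0.5)$$ and the median can be checked on the uniform distribution over $$\{ 1, 2, 3, 4 \}$$. The sketch below implements the infimum definition for a finite numeric support and $$p \in (0, 1]$$; the function name is hypothetical and the code is independent of probnum.

```python
def categorical_quantile(p, support, probabilities):
    """Smallest support value x with p <= F(x), for p in (0, 1].

    Assumes a finite, numeric, scalar-valued support.
    """
    cumulative = 0.0
    for value, prob in sorted(zip(support, probabilities)):
        cumulative += prob  # F(value) after including this support point
        if p <= cumulative:
            return value
    return max(support)  # guards against rounding when p is close to 1


support = [1, 2, 3, 4]
probabilities = [0.25, 0.25, 0.25, 0.25]
q_half = categorical_quantile(0.5, support, probabilities)
```

Here `q_half` is the integer 2, a member of the support, whereas the median as defined in this class averages the two middle values and equals 2.5.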

resample(rng)

Resample the support of the categorical random variable.

Return a new categorical random variable (RV), where the support is randomly chosen from the elements in the current support with probabilities given by the current event probabilities. The probabilities of the resulting categorical RV are all equal.

Parameters

rng (Generator) – Random number generator.

Returns

Categorical random variable with resampled support (according to self.probabilities).

Return type

Categorical
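The resampling step can be sketched with NumPy's random generator. This is an illustration of the description above, not probnum's code: `resample_categorical` is a hypothetical helper, and drawing exactly as many elements as the current support contains is an assumption.

```python
import numpy as np


def resample_categorical(support, probabilities, rng):
    """Draw a new support from the old one and assign equal probabilities."""
    support = np.asarray(support)
    num_events = support.shape[0]
    # Assumption: draw K elements (with replacement) for a support of size K.
    indices = rng.choice(num_events, size=num_events, p=probabilities)
    new_support = support[indices]
    new_probabilities = np.full(num_events, 1.0 / num_events)
    return new_support, new_probabilities


rng = np.random.default_rng(seed=1)
new_support, new_probabilities = resample_categorical(
    support=np.array([10.0, 20.0, 30.0]),
    probabilities=np.array([0.2, 0.5, 0.3]),
    rng=rng,
)
```

Elements with large probability tend to appear several times in the new support, so the uniform weights on the resampled support approximate the original distribution.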

reshape(newshape)

Give a new shape to a random variable.

Parameters

newshape (Union[Integral, Iterable[Integral]]) – New shape for the random variable. It must be compatible with the original shape.

Return type

RandomVariable

sample(rng, size=())

Draw realizations from a random variable.

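Parameters

• rng (Generator) – Random number generator used to draw the realizations.

• size – Shape of the sample of realizations to draw. Default is (), which draws a single realization.

Return type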

~ValueType
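Sampling from a categorical distribution amounts to drawing support indices according to the event probabilities. The sketch below mimics the `sample(rng, size=())` signature with plain NumPy. The function `sample_categorical` is hypothetical, and interpreting `size` as the leading shape of the returned sample is an assumption.

```python
import numpy as np


def sample_categorical(support, probabilities, rng, size=()):
    """Draw realizations by sampling indices into the support."""
    support = np.asarray(support)
    shape = (size,) if isinstance(size, int) else tuple(size)
    num_draws = int(np.prod(shape))  # the empty shape () yields one draw
    indices = rng.choice(support.shape[0], size=num_draws, p=probabilities)
    return support[indices.reshape(shape)]


rng = np.random.default_rng(seed=42)
support = np.array([10.0, 20.0, 30.0])
probabilities = np.array([0.2, 0.5, 0.3])

single = sample_categorical(support, probabilities, rng)
batch = sample_categorical(support, probabilities, rng, size=(5, 2))
```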

transpose(*axes)

Transpose the random variable.

Parameters

axes (int) – See documentation of numpy.ndarray.transpose().

Return type

RandomVariable