In information theory, the cross entropy between two probability distributions p and q over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set when the coding scheme used is optimized for an "unnatural" probability distribution q rather than for the "true" distribution p.
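Written out, for discrete distributions p and q over the same support, this is

H(p, q) = -\sum_{x} p(x) \log_2 q(x)

(in bits; using the natural logarithm gives nats). Equivalently, H(p, q) = H(p) + D_{KL}(p \| q): the entropy of p plus the Kullback–Leibler divergence of q from p, where the KL term is the extra coding cost incurred by using q instead of p.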
In machine learning, cross entropy is widely used as a loss function for classification, with p taken to be the true (often one-hot) label distribution and q the model's predicted distribution.
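As an illustration, here is a minimal NumPy sketch of the cross-entropy computation for a single prediction; the function and variable names are made up for this example and do not come from any particular library:

```python
import numpy as np

def cross_entropy(p, q, eps=1e-12):
    """Cross entropy H(p, q) for discrete distributions.

    p   -- true distribution (e.g., a one-hot label vector)
    q   -- predicted distribution; entries should sum to 1
    eps -- small constant to avoid log(0)

    Uses the natural logarithm, so the result is in nats;
    replace np.log with np.log2 to measure in bits.
    """
    q = np.clip(q, eps, 1.0)
    return -np.sum(p * np.log(q))

# Example: the true class is index 0; the model assigns it probability 0.7.
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.7, 0.2, 0.1])
print(cross_entropy(p, q))  # ~0.357 nats, i.e. -log(0.7)
```

For one-hot p, the sum reduces to -log q(true class), which is why this loss is also called negative log-likelihood: minimizing it pushes the model to assign high probability to the correct class.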