# Discrete logarithm

**Spoiler:** *The discrete logarithm is an algebraic
operation whose difficulty is the basis of many modern cryptographic
algorithms. This article presents the concepts necessary for its
definition as well as its weaknesses in certain specific cases. We start
by using a finite group. We define an operation, its exponentiation and
then, its logarithm.*

After the warm up on the Caesar cipher, we are
continuing our cryptographic journey towards the *discrete
logarithm* problem.

For this journey, we will discuss some *notions-stages* of
General Algebra which will then allow us to define and better understand
this operation so important in cryptography.

# The groups

Who says (algebraic) operation necessarily says (algebraic) structure. A set provided with an operation to manipulate the elements, all obeying fixed rules.

For developers, you can think of algebra as object-oriented programming. Sets are classes, of which objects are the elements. Axioms are like interfaces that your classes must implements.

## Definition

For the present case, we are going to use finite groups. That is to say sets containing a finite number of elements and provided with a binary operation respecting the following 4 properties:

**Internal composition:**the result of the operation is part of the whole,**Associativity:**the result does not change depending on the way of grouping the terms,**Identity element**there is an element (noted $e$ or $0$) which has no effect,**Symmetric:**whatever the element, there is an element that cancels it.

It is customary to denote a group as a pair containing the set then the operation: $(E, \oplus)$.

## Example with Booleans

The simplest and easiest finite group to demonstrate is the set of
Booleans (true and false, noted ⊤ and ⟂) provided with the *exclusive
or* (noted ⊕).

a | b | a ⊕ b |
---|---|---|

⊤ | ⊤ | ⟂ |

⊤ | ⟂ | ⊤ |

⟂ | ⊤ | ⊤ |

⟂ | ⟂ | ⟂ |

**Internal :**the result of the operation is always boolean,**Associative :**$a \oplus (b \oplus c) = (a \oplus b) \oplus c$, I let you check it.**Identity:***false*is the identity element as the result will be the value of the other operand,**Symétrique :**any element is it self inverse, $a \oplus a = \perp$.

If you want to get to the root of it, the single element set also forms a group with any internal operation but doesn’t frankly have a hectic use.

## Example with integers

Closer to developers and cryptography, the sets of integers
*modulo
$n$*,
where we keep only the rest of the divisions by
$n$,
are groups with the addition
$(Z/nZ,+)$
but also with the multiplication
$(Z/nZ,.)$.

Here is an example in `C++`

for groups
$(Z/nZ,+)$
for
$n$
a base integer. To get a multiplicative group, just implement
`operator*`

. For larger groups, the implementation would have
to be changed.

```
template<unsigned int N>
class Field
{
public:
unsigned int value ;
public:
() : value(0) {}
Field(unsigned int i) : value(i % N) { }
Field
};
template <const unsigned int N>
<N> operator+(Field<N> const & lhs, Field<N> const & rhs)
Field{
return Field<N>(lhs.value + rhs.value) ;
}
```

I chose to implement the operators outside the class to allow me to introduce new ones in the rest of this article without having to rewrite all the code and allow you to add the examples to your file as you go. Here are two other handy operators for manipulating group elements:

```
#include <iostream>
/**
* Allow writing elements on a stream
*/
template <typename Stream, const unsigned int N>
& operator<<(Stream & stream, Field<N> const & elem)
Stream {
<< "{" << elem.value << "}" ;
stream return stream ;
}
/**
* Allow equality check between elements
*/
template <const unsigned int N>
bool operator==(const Field<N> & lhs, Field<N> const & rhs)
{
return lhs.value == rhs.value ;
}
```

# Generator and generated group

For each element of a group $(E,\oplus)$, we can build a sequence by applying the element ($a$) iteratively as follows:

$\begin{align} u_0 & = 0\\ u_1 & = a\\ u_2 & = a \oplus a\\ u_3 & = a \oplus a \oplus a \\ u_n & = a \oplus \dots \oplus a \end{align}$

To simplify the writing, we encounter two shortcuts notation:

- $u_n = n.a$ in the case of additive groups,
- $u_n = a^n$ in the case of multiplicative groups.
When there is no clear evidente of which case you are in, prefer the second notation as it denotes better the

exponentialnature of the problem.

We then say that $a$ generates the subgroup formed of the successive values of the sequence. In the case of finite groups, this sequence is cyclic (it eventually comes back to the identity element and repeat itself later) and provides a very practical correspondence with $(Z/nZ,\oplus)$ where $n$ is the number of elements of the sequence (we say that it is an isomorphism).

The computation of an element of the sequence is an *easy*
calculation, its complexity is
proportional to the size of the number. To continue on the example in
`C++`

, the following code is the implementation of the
generator, in additive notation.

```
/**
* Additive notation of generation
*/
template <const unsigned int N>
<N> operator*(unsigned int coef, Field<N> const & base)
Field{
if (coef == 0) {
return Field<N>() ;
} else if (coef == 1) {
return base ;
} else {
<N> big = (coef / 2) * base ;
Field<N> little = (coef % 2) * base ;
Fieldreturn big + big + little ;
}
}
```

We can note that the implementation of this operator as an internal composition of the group gives it a structure of ring . The set is then provided with two operators respecting some practical properties (but which we will not tackle to stay in the subject).

```
template <const unsigned int N>
<N> operator*(Field<N> const & base, Field<N> const & exp)
Field{
return base.value * exp ;
}
```

# The Discrete Logarithm

We then arrive at the definition of the *discrete logarithm*
which consists quite simply in reversing this correspondence between
$n$
and
$u_n$.

The discrete logarithm of $a$ in base $g$, which we denote by $log_g(a)$, is the integer $n$ such that $a=g^n$.

Unlike the exponentiation which was *easy*, the computation of
the discrete logarithm is *hard*. The security of most algorithm
(*e.g.* Diffie-Hellmann
key exchange and the cryptosystem
El Gamal) is based on this disparity of complexity (easy in one way,
hard the other way).

But this difficulty of resolution depends on the group and the chosen generator. We will therefore discuss the weaknesses of the group $(Z/nZ,+)$ and $(Z/nZ,.)$. Other groups are also used, such as elliptical curves, but will not be discussed.

## Weaknesses of $(Z/nZ, +)$

The groups for which the computation of the *discrete
logarithm* is easy are thus deprecated. This is the case of the
additive group
$(Z/nZ,+)$
which serves as an example of implementation from the beginning.

Indeed, this group has some properties which facilitate the task of an attacker. As we saw before, it is possible to build a ring from the group by adding the multiplication (equivalent here to the exponentiation calculation).

Solving the logarithm is here equivalent to computing a division in the ring $(Z/nZ,+,.)$. If the generator is prime with $n$, it has an inverse for the multiplication which we can compute thanks to the extended Euclidean algorithm which also has a polynomial complexity.

The following `C++`

code shows an implementation of this
algorithm:

```
template <const unsigned int N>
<N> inverse(Field<N> const & elem)
Field{
int r1 = elem.value ; int u1 = 1 ; int v1 = 0 ;
int r2 = N ; int u2 = 0 ; int v2 = 1 ;
while (r2 != 0) {
int q = r1 / r2 ;
int r3 = r1 - q * r2 ;
int u3 = u1 - q * u2 ;
int v3 = v1 - q * v2 ;
= r2 ; u1 = u2 ; v1 = v2 ;
r1 = r3 ; u2 = u3 ; v2 = v3 ;
r2 }
return Field<N>(u1) ;
}
```

With this inverse in the multiplicative meaning, the calculation of
the *discrete logarithm* is immediate since the effects of this
inverse cancel those of the generator:

$n . a . a^{-1} = n . 1 = n$

Here comes its implementation:

```
template <const unsigned int N>
<N> log(Field<N> const & elem, Field<N> const & base)
Field{
<N> i = inverse(base) ;
Fieldreturn elem * i ;
}
```

If the generator is not prime with
$n$,
the generated group does not cover all the values of
$Z/nZ$
but only a part: the multiples of the *gcd* of
$n$
and
$g$.
By dividing the numbers and the generator by this *gcd*, we
obtain a new group on which the resolution of the discrete logarithm
gives the same value. *Et voila*.

## Weaknesses of $(Z/nZ, .)$

The choice of generator is also important for multiplicative groups like $(Z/nZ,.)$. These have two risks:

- Only generate a subgroup of a ridiculously small size, making a brute force (or dictionary) attack feasible,
- Be vulnerable to
*Pohlig-Hellman*reduction.

The Pohlig-Hellman reduction.When the order of the group is a multiple of two numbers, we can divide the computation of the logarithm into the two corresponding subgroups and then combine the results. A sort of shortcut that saves a lot of time.

We therefore impose the following constraints when building a group to use it in cryptography:

- $n$ must be large, so that its resolution is long, typically 2048 bits according to ANSSI recommendations,
- $n$ must be a prime number, to guarantee that the generators generates the whole group,
- $(n-1)/2$
is also a prime number, to avoid weakening the system via a computation
with
*Pohlig-Hellman algorighm*, - $g$ must be of order $n-1$, so spawn the whole group and avoid a brute force attack.

These precautions being taken, the best algorithms to solve the
*discrete logarithm* remain exponential in the size of
$n$.
This is precisely the desired goal and allows its use in a cryptographic
context.

# And after ?

The *discrete logarithm* is a rather classic algebraic
operation in its definition and whose computation hardness depends both
on the group used and on the generator chosen. It is this hardness that
is used in other algorithms like Diffie-Hellmann key exchange
and the El Gamal cryptosystem.