combinatorics Combinatorics is an area of mathematics primarily concerned with counting, both as a means and as an end to obtaining results, and certain properties of finite structures. It is closely related to many other areas of mathematics and has many ...

, a branch of

mathematics Mathematics is a field of study that discovers and organizes methods, Mathematical theory, theories and theorems that are developed and Mathematical proof, proved for the needs of empirical sciences and mathematics itself. There are many ar ...

, a weighted matroid is a

matroid In combinatorics, a matroid is a structure that abstracts and generalizes the notion of linear independence in vector spaces. There are many equivalent ways to define a matroid Axiomatic system, axiomatically, the most significant being in terms ...

endowed with a function that assigns a weight to each element. Formally, let

M = (E, I)

be a matroid, where ''E'' is the set of elements and ''I'' is the family of independent set. A weighted matroid has a ''weight function''

w : E \rightarrow \mathbb^+

for assigns a strictly positive weight to each element of

E

. We extend the function to subsets of

E

by summation;

w(A)

is the sum of

w(x)

over

x

A

Finding a maximum-weight independent set

A basic problem regarding weighted matroids is to find an independent set with a maximum total weight. This problem can be solved using the following simple

greedy algorithm A greedy algorithm is any algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. In many problems, a greedy strategy does not produce an optimal solution, but a greedy heuristic can yield locally ...

: * Initialize the set ''A'' to an empty set. Note that, by definition of a matroid, ''A'' is an independent set. * For each element ''x'' in ''E''\''A'', check whether Au is still an independent set. ** If there are no such elements, then stop, as ''A'' cannot be extended anymore. ** If there is at least one such element, then choose the one with maximum weight, and add it to A. This algorithm does not need to know anything about the matroid structure; it just needs an independence oracle for the matroid - a subroutine for testing whether a set is independent.

Jack Edmonds Jack R. Edmonds (born April 5, 1934) is an American-born and educated computer scientist and mathematician who lived and worked in Canada for much of his life. He has made fundamental contributions to the fields of combinatorial optimization, po ...

proved that this simple algorithm indeed finds an independent set with maximum weight. Denote the set found by the algorithm by e₁,...,e_k. By the matroid properties, it is clear that k=rank(M), otherwise the set could be extended. Assume by contradiction that there is another set with a higher weight. Without loss of generality, it is possible to assume that this set has rank(M) elements too; denote it by f₁,...,f_k. Order these items such that w(f₁) ≥ ... ≥ w(f_k). Let ''j'' be the first index for which w(f_j) > w(e_j). Apply the augmentation property to the sets and ; we conclude that there must be some ''i'' ≤ j such that f_i could be added to while keeping it independent. But w(f_i) ≥ w(f_j) > w(e_j), so f_i should have been chosen in step j instead of e_j - a contradiction.

Example: spanning forest algorithms

As a simple example, say we wish to find the maximum spanning forest of a graph. That is, given a graph and a weight for each edge, find a forest containing every vertex and maximizing the total weight of the edges in the tree. This problem arises in some clustering applications. It can be solved by

Kruskal's algorithm Kruskal's algorithm finds a minimum spanning forest of an undirected edge-weighted graph. If the graph is connected, it finds a minimum spanning tree. It is a greedy algorithm that in each step adds to the forest the lowest-weight edge that ...

, which can be seen as the special case of the above greedy algorithm to a graphical matroid. If we look at the definition of the forest matroid, we see that the maximum spanning forest is simply the independent set with largest total weight — such a set must span the graph, for otherwise we can add edges without creating cycles. But how do we find it?

Finding a basis

There is a simple algorithm for finding a basis: * Initially let

A

be the empty set. * For each

x

E

** if

A \cup \

is independent, then set

A

A \cup \

. The result is clearly an independent set. It is a

maximal independent set In graph theory, a maximal independent set (MIS) or maximal stable set is an Independent set (graph theory), independent set that is not a subset of any other independent set. In other words, there is no Vertex (graph theory), vertex outside th ...

because if

B \cup \

is not independent for some subset

B

A

, then

A \cup \

is not independent either (the contrapositive follows from the

hereditary property In mathematics, a hereditary property is a property of an object that is inherited by all of its subobjects, where the meaning of ''subobject'' depends on the context. These properties are particularly considered in topology and graph theory, but a ...

). Thus if we pass up an element, we'll never have an opportunity to use it later. We will generalize this algorithm to solve a harder problem.

Extension to optimal

An independent set of largest total weight is called an ''optimal'' set. Optimal sets are always bases, because if an edge can be added, it should be; this only increases the total weight. As it turns out, there is a trivial

for computing an optimal set of a weighted matroid. It works as follows: * Initially let

A

be the empty set. * For each

x

E

, taken in (monotonically) decreasing order by weight ** if

A \cup \

is independent, then set

A

A \cup \

. This algorithm finds a basis, since it is a special case of the above algorithm. It always chooses the element of largest weight that it can while preserving independence (thus the term "greedy"). This always produces an optimal set: suppose that it produces

A=\

and that

B=\

. Now for any

k

with

1\le k\le r

, consider open sets

O_1=\

and

O_2=\

. Since

O_1

is smaller than

O_2

, there is some element of

O_2

which can be put into

O_1

with the result still being independent. However

e_k

is an element of maximal weight that can be added to

O_1

to maintain independence. Thus

e_k

is of no smaller weight than some element of

O_2

, and hence

e_k

is of at least a large a weight as

f_k

. As this is true for all

k

A

is weightier than

B

Complexity analysis

The easiest way to traverse the members of

E

in the desired order is to sort them. This requires

O(, E, \log, E, )

time using a comparison

sorting algorithm In computer science, a sorting algorithm is an algorithm that puts elements of a List (computing), list into an Total order, order. The most frequently used orders are numerical order and lexicographical order, and either ascending or descending ...

. We also need to test for each

x

whether

A \cup \

is independent; assuming independence tests require

O(f(, E, ))

time, the total time for the algorithm is

O(, E, \log, E,  + , E, f(, E, ))

. If we want to find a

minimum spanning tree A minimum spanning tree (MST) or minimum weight spanning tree is a subset of the edges of a connected, edge-weighted undirected graph that connects all the vertices together, without any cycles and with the minimum possible total edge weight. ...

instead, we simply "invert" the weight function by subtracting it from a large constant. More specifically, let

w_(x) = W - w(x)

, where

W

exceeds the total weight over all graph edges. Many more optimization problems about all sorts of matroids and weight functions can be solved in this trivial way, although in many cases more efficient algorithms can be found that exploit more specialized properties.

Matroid requirement

Note also that if we take a set

I

of "independent" sets which is a down-set but not a matroid, then the greedy algorithm will not always work. For then there are independent sets

I_1

and

I_2

with

, I_1, <, I_2,

, but such that for no

e\in I_2\setminus I_1

I_1\cup e

independent. Pick an

\epsilon>0

and

\tau>0

such that

(1+2\epsilon), I_1, +\tau, E, <, I_2,

. Weight the elements of

I_1\cup I_2

in the range

2

2+2\epsilon

, the elements of

I_1\setminus I_2

in the range

1+\epsilon

1+2\epsilon

, the elements of

I_2\setminus I_1

in the range

1

1+\epsilon

, and the rest in the range

0

\tau

. The greedy algorithm will select the elements of

I_1

, and then cannot pick any elements of

I_2\setminus I_1

. Therefore, the independent set it constructs will be of weight at most

(1+2\epsilon), I_1, +\tau, E, +, I_1\cup I_2,

, which is smaller than the weight of

I_2

Characterization

This optimization algorithm may be used to characterize matroids: if a family ''F'' of sets, closed under taking subsets, has the property that, no matter how the sets are weighted, the greedy algorithm finds a maximum-weight set in the family, then ''F'' must be the family of independent sets of a matroid.

Generalizations

The notion of matroid has been generalized to allow for other types of sets on which a greedy algorithm gives optimal solutions; see

greedoid In combinatorics, a greedoid is a type of set system. It arises from the notion of the matroid, which was originally introduced by Hassler Whitney, Whitney in 1935 to study planar graphs and was later used by Jack Edmonds, Edmonds to characterize ...

and

matroid embedding In combinatorics, a matroid embedding is a set system (''F'', ''E''), where ''F'' is a collection of ''feasible sets'', that satisfies the following properties. # Accessibility property: Every non-empty feasible set ''X'' contains an elem ...

for more information. Korte and Lovász would generalize these ideas to objects called ''

s'', which allow even larger classes of problems to be solved by greedy algorithms.

References

{{Reflist Matroid theory