
Theory of Probability and Mathematical Statistics


1. THEORETICAL PART


1.1 Convergence of sequences of random variables and probability distributions


In probability theory one has to deal with different types of convergence of random variables. We consider the following main types of convergence: in probability, with probability one, in mean of order p, and in distribution.

Let ξ, ξ₁, ξ₂, … be random variables defined on some probability space (Ω, F, P).

Definition 1. A sequence of random variables ξ₁, ξ₂, … is said to converge in probability to a random variable ξ (notation: ξₙ →ᴾ ξ) if for any ε > 0

P(|ξₙ − ξ| ≥ ε) → 0, n → ∞.


Definition 2. A sequence of random variables ξ₁, ξ₂, … is said to converge with probability one (almost surely, almost everywhere) to a random variable ξ if

P(ω : ξₙ(ω) ↛ ξ(ω)) = 0,

i.e. if the set of outcomes ω for which ξₙ(ω) do not converge to ξ(ω) has probability zero.

This type of convergence is denoted ξₙ → ξ (P-a.s.), or ξₙ → ξ (a.e.), or ξₙ → ξ with probability 1.

Definition 3. A sequence of random variables ξ₁, ξ₂, … is said to converge in mean of order p, 0 < p < ∞, to a random variable ξ if

M|ξₙ − ξ|ᵖ → 0, n → ∞.


Definition 4. A sequence of random variables ξ₁, ξ₂, … is said to converge in distribution to a random variable ξ (notation: ξₙ ⇒ ξ) if for any bounded continuous function f

M f(ξₙ) → M f(ξ), n → ∞.


Convergence in distribution is defined solely in terms of the convergence of distribution functions. It therefore makes sense to speak of this kind of convergence even when the random variables are defined on different probability spaces.
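As an illustration of Definition 1, the following minimal Python sketch estimates P(|ξₙ − 1/2| ≥ ε) for sample means of coin tosses, which converge in probability to 1/2 by the law of large numbers; the sample sizes and the tolerance ε = 0.05 are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# xi_n = mean of n coin flips; it converges in probability (and a.s.) to 1/2.
# Estimate P(|xi_n - 1/2| >= eps) for several n by Monte Carlo over 2000 paths.
eps = 0.05
for n in [10, 100, 1000, 10000]:
    flips = rng.integers(0, 2, size=(2000, n))
    xi_n = flips.mean(axis=1)
    prob = np.mean(np.abs(xi_n - 0.5) >= eps)
    print(f"n={n:6d}  P(|xi_n - 1/2| >= {eps}) ~ {prob:.4f}")
```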

Theorem 1.

a) In order that ξₙ → ξ (P-a.s.), it is necessary and sufficient that for any ε > 0

P(sup_{k≥n} |ξₖ − ξ| ≥ ε) → 0, n → ∞.

b) The sequence (ξₙ) is fundamental with probability one if and only if for any ε > 0

P(sup_{k≥n, l≥n} |ξₖ − ξₗ| ≥ ε) → 0, n → ∞.

Proof.

a) Let Aₙ(ε) = {ω : |ξₙ − ξ| ≥ ε} and A(ε) = ∩_{n≥1} ∪_{k≥n} Aₖ(ε). Then

{ω : ξₙ ↛ ξ} = ∪_{ε>0} A(ε) = ∪_{m≥1} A(1/m).

Therefore statement a) is the result of the following chain of implications:

P{ω : ξₙ ↛ ξ} = 0 ⟺ P(A(1/m)) = 0 for all m ≥ 1 ⟺ P(∪_{k≥n} Aₖ(1/m)) → 0, n → ∞, for all m ≥ 1 ⟺ P(∪_{k≥n} Aₖ(ε)) → 0, n → ∞, for all ε > 0 ⟺ P(sup_{k≥n} |ξₖ − ξ| ≥ ε) → 0, n → ∞, for all ε > 0.

b) Denote Bₙ(ε) = {ω : sup_{k≥n, l≥n} |ξₖ − ξₗ| ≥ ε}. Then {ω : (ξₙ(ω)) is not fundamental} = ∪_{ε>0} ∩_{n≥1} Bₙ(ε), and just as in a) it is shown that P{ω : (ξₙ(ω)) is not fundamental} = 0 ⟺ P(Bₙ(ε)) → 0, n → ∞, for every ε > 0.

The theorem is proved.


Theorem 2 (Cauchy criterion for almost sure convergence).

In order for a sequence of random variables () to converge with probability one (to some random variable), it is necessary and sufficient that it be fundamental with probability one.

Proof.

If ξₙ → ξ (P-a.s.), then

sup_{k≥n, l≥n} |ξₖ − ξₗ| ≤ sup_{k≥n} |ξₖ − ξ| + sup_{l≥n} |ξₗ − ξ|,

whence, by Theorem 1, the necessity of the condition of the theorem follows.

Now let the sequence (ξₙ) be fundamental with probability one. Denote L = {ω : (ξₙ(ω)) is not fundamental}; then P(L) = 0. For every ω ∉ L the numerical sequence (ξₙ(ω)) is fundamental and, according to the Cauchy criterion for numerical sequences, lim ξₙ(ω) exists. Let us put

ξ(ω) = lim ξₙ(ω) for ω ∉ L,  ξ(ω) = 0 for ω ∈ L.

The function ξ defined in this way is a random variable, and ξₙ → ξ (P-a.s.).

The theorem has been proven.


1.2 Method of characteristic functions


The method of characteristic functions is one of the main tools of the analytical apparatus of probability theory. Along with random variables (taking real values), the theory of characteristic functions requires the use of complex-valued random variables.

Many of the definitions and properties relating to random variables are easily transferred to the complex case. Thus, the mathematical expectation Mζ of a complex-valued random variable ζ = ξ + iη is considered defined if the mathematical expectations Mξ and Mη are defined; in this case, by definition, we set Mζ = Mξ + iMη. It follows from the definition of independence of random elements that complex-valued quantities ζ₁ = ξ₁ + iη₁ and ζ₂ = ξ₂ + iη₂ are independent if and only if the pairs of random variables (ξ₁, η₁) and (ξ₂, η₂) are independent or, what is the same, the σ-algebras F_{ξ₁,η₁} and F_{ξ₂,η₂} are independent.

Along with the space L² of real random variables with a finite second moment, we can introduce the Hilbert space of complex-valued random variables ζ = ξ + iη with M|ζ|² < ∞, where |ζ|² = ξ² + η², and the scalar product (ζ₁, ζ₂) = M ζ₁ζ̄₂, where ζ̄₂ is the complex conjugate random variable.

In algebraic operations, vectors a ∈ Rⁿ are treated as column vectors,

a = (a₁, …, aₙ)ᵀ,

and a* = (a₁, a₂, …, aₙ) as a row vector. If a, b ∈ Rⁿ, then their scalar product (a, b) is understood as the quantity

(a, b) = Σ_{i=1}^{n} aᵢbᵢ.

It is clear that (a, a) = |a|². If a ∈ Rⁿ and R = ‖r_{ij}‖ is a matrix of order n×n, then

(Ra, a) = Σ_{i,j=1}^{n} r_{ij} aᵢaⱼ.



Definition 1. Let F = F(x₁, …, xₙ) be an n-dimensional distribution function in (Rⁿ, B(Rⁿ)). Its characteristic function is the function

φ(t) = ∫_{Rⁿ} e^{i(t,x)} dF(x), t ∈ Rⁿ.


Definition 2. If ξ = (ξ₁, …, ξₙ) is a random vector defined on a probability space (Ω, F, P) with values in Rⁿ, then its characteristic function is the function

φ_ξ(t) = ∫_{Rⁿ} e^{i(t,x)} dF_ξ(x), t ∈ Rⁿ,

where F_ξ = F_ξ(x₁, …, xₙ) is the distribution function of the vector ξ = (ξ₁, …, ξₙ).

If the distribution function F(x) has a density f = f(x), then

φ(t) = ∫_{Rⁿ} e^{i(t,x)} f(x) dx.

In this case, the characteristic function is nothing other than the Fourier transform of the function f(x).
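As an illustration, the following sketch compares the empirical characteristic function of a simulated N(0, 1) sample with the exact value φ(t) = e^{−t²/2}; the sample size and the points t are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Empirical characteristic function (1/n) * sum exp(i t x_j) for an N(0,1)
# sample, compared with the exact phi(t) = exp(-t^2/2).
x = rng.standard_normal(100_000)
for t in [0.0, 0.5, 1.0, 2.0]:
    phi_emp = np.exp(1j * t * x).mean()
    phi_exact = np.exp(-t**2 / 2)
    print(f"t={t:.1f}  empirical={phi_emp:.4f}  exact={phi_exact:.4f}")
```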

It follows that the characteristic function φ_ξ(t) of a random vector ξ can also be defined by the equality

φ_ξ(t) = M e^{i(t,ξ)}, t ∈ Rⁿ.



Basic properties of characteristic functions (in the case of n=1).

Let ξ = ξ(ω) be a random variable, F_ξ = F_ξ(x) its distribution function, and φ_ξ(t) = M e^{itξ} its characteristic function.

It should be noted that if ξ₁, …, ξₙ are independent random variables and Sₙ = ξ₁ + … + ξₙ, then

φ_{Sₙ}(t) = Π_{i=1}^{n} φ_{ξᵢ}(t).

Indeed,

φ_{Sₙ}(t) = M e^{it(ξ₁+…+ξₙ)} = M(e^{itξ₁} ⋯ e^{itξₙ}) = M e^{itξ₁} ⋯ M e^{itξₙ} = Π_{i=1}^{n} φ_{ξᵢ}(t),

where we used the fact that the mathematical expectation of the product of independent (bounded) random variables is equal to the product of their mathematical expectations.

This multiplication property is the key one in proving limit theorems for sums of independent random variables by the method of characteristic functions. By contrast, the distribution function of the sum is expressed in terms of the distribution functions of the individual terms in a much more complex way, namely F_{Sₙ} = F_{ξ₁} * ⋯ * F_{ξₙ}, where the sign * means the convolution of distributions.
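The multiplication property is easy to check numerically; in the sketch below the distributions of the summands (exponential and uniform) are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

# For independent xi and eta the characteristic function of the sum factors:
# phi_{xi+eta}(t) = phi_xi(t) * phi_eta(t).  Empirical check at one point t.
xi = rng.exponential(1.0, 200_000)
eta = rng.uniform(-1.0, 1.0, 200_000)
t = 1.3
ecf = lambda sample: np.exp(1j * t * sample).mean()   # empirical char. function
print(ecf(xi + eta))         # left-hand side
print(ecf(xi) * ecf(eta))    # right-hand side, approximately equal
```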

Each distribution function on (R, B(R)) can be associated with a random variable that has this function as its distribution function. Therefore, when presenting the properties of characteristic functions, we can restrict ourselves to the characteristic functions of random variables.

Theorem 1. Let ξ be a random variable with distribution function F = F(x) and characteristic function φ(t) = M e^{itξ}.

The following properties take place:

1) |φ(t)| ≤ φ(0) = 1;

2) φ(t) is uniformly continuous in t ∈ R;

3) φ(t) is a real-valued function if and only if the distribution F is symmetric;

4) if M|ξ|ⁿ < ∞ for some n ≥ 1, then for all r ≤ n the derivatives φ^{(r)}(t) exist and

φ^{(r)}(t) = ∫_R (ix)^r e^{itx} dF(x), so that M ξ^r = φ^{(r)}(0) / i^r;

5) if φ^{(2n)}(0) exists and is finite, then M ξ^{2n} < ∞;

6) if M|ξ|ⁿ < ∞ for all n ≥ 1 and

lim sup_{n→∞} (M|ξ|ⁿ)^{1/n} / n = 1/(eT) < ∞,

then for all |t| < T

φ(t) = Σ_{n=0}^{∞} (it)ⁿ M ξⁿ / n!.

The following theorem shows that the characteristic function uniquely determines the distribution function.

Theorem 2 (uniqueness). Let F and G be two distribution functions having the same characteristic function, that is, for all t ∈ R

∫_R e^{itx} dF(x) = ∫_R e^{itx} dG(x).

Then F = G.

The theorem says that the distribution function F = F(x) is uniquely recovered from its characteristic function φ(t). The following theorem gives an explicit representation of the function F in terms of φ.

Theorem 3 (inversion formula). Let F = F(x) be a distribution function and φ(t) its characteristic function.

a) For any two points a, b (a < b) at which the function F = F(x) is continuous,

F(b) − F(a) = lim_{c→∞} (1/2π) ∫_{−c}^{c} ((e^{−ita} − e^{−itb})/(it)) φ(t) dt.

b) If ∫_R |φ(t)| dt < ∞, then the distribution function F(x) has a density f(x),

F(x) = ∫_{−∞}^{x} f(y) dy,

and

f(x) = (1/2π) ∫_R e^{−itx} φ(t) dt.

Theorem 4. For the components of a random vector ξ = (ξ₁, …, ξₙ) to be independent, it is necessary and sufficient that its characteristic function be the product of the characteristic functions of the components:

φ_ξ(t₁, …, tₙ) = Π_{k=1}^{n} φ_{ξₖ}(tₖ), (t₁, …, tₙ) ∈ Rⁿ.


Bochner-Khinchin theorem. Let φ(t) be a continuous function, t ∈ R, with φ(0) = 1. In order for φ(t) to be characteristic, it is necessary and sufficient that it be non-negative definite, that is, for any real t₁, …, tₙ and any complex numbers λ₁, …, λₙ, n ≥ 1,

Σ_{k,l=1}^{n} φ(tₖ − tₗ) λₖ λ̄ₗ ≥ 0.

Theorem 5. Let φ_ξ(t) be the characteristic function of a random variable ξ.

a) If |φ_ξ(t₀)| = 1 for some t₀ ≠ 0, then the random variable ξ is a lattice variable with step h = 2π/t₀, that is,

Σ_{k=−∞}^{∞} P(ξ = a + kh) = 1,

where a is some constant.

b) If |φ_ξ(t)| = |φ_ξ(αt)| = 1 for two different points t and αt, where α is an irrational number, then the random variable ξ is degenerate:

P(ξ = a) = 1,

where a is some constant.

c) If |φ_ξ(t)| ≡ 1, then the random variable ξ is degenerate.


1.3 Central limit theorem for independent identically distributed random variables


Let ξ₁, ξ₂, …, ξₙ, … be a sequence of independent, identically distributed random variables with mathematical expectation Mξᵢ = a and variance Dξᵢ = σ², let Sₙ = ξ₁ + … + ξₙ, and let Φ(x) be the distribution function of the normal law with parameters (0, 1). We also introduce the sequence of random variables

ζₙ = (Sₙ − na) / (σ√n).

Theorem. If 0 < σ² < ∞, then as n → ∞

P(ζₙ < x) → Φ(x)

uniformly in x (−∞ < x < ∞).

In this case, the sequence (ζₙ) is called asymptotically normal.

From the fact that Mζₙ² = 1 and from the continuity theorems it follows that, along with the weak convergence M f(ζₙ) → M f(ζ) for any continuous bounded f, there is also convergence M f(ζₙ) → M f(ζ) for any continuous f such that |f(x)| ≤ c(1 + |x|) for some c > 0.

Proof.

The uniform convergence here is a consequence of the weak convergence and of the continuity of Φ(x). Further, without loss of generality we may assume a = 0, since otherwise we could consider the sequence (ξₙ − a), while the sequence (ζₙ) would not change. Therefore, to prove the required convergence it suffices to show that

φ_{ζₙ}(t) → e^{−t²/2}, n → ∞, when a = 0.

We have

φ_{ζₙ}(t) = (φ(t/(σ√n)))ⁿ, where φ(t) = φ_{ξ₁}(t).

Since Mξ₁² exists, there exists the decomposition

φ(t) = 1 − σ²t²/2 + o(t²), t → 0.

Therefore, for n → ∞,

φ_{ζₙ}(t) = (1 − t²/(2n) + o(1/n))ⁿ → e^{−t²/2}.
The theorem has been proven.
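A simple simulation sketch of the theorem: for exponential summands (a = 1, σ = 1, an arbitrary illustrative choice) the distribution function of ζₙ is compared with Φ(x) at a few points.

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(3)

# zeta_n = (S_n - n*a) / (sigma*sqrt(n)) for exponential(1) summands
# (a = 1, sigma = 1); by the CLT its distribution approaches N(0, 1).
n, paths = 1000, 50_000
s = rng.exponential(1.0, size=(paths, n)).sum(axis=1)
zeta = (s - n) / sqrt(n)

Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))   # standard normal d.f.
for x in [-2, -1, 0, 1, 2]:
    print(f"x={x:+d}  P(zeta < x) ~ {np.mean(zeta < x):.4f}  Phi(x) = {Phi(x):.4f}")
```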


1.4 The main tasks of mathematical statistics, their brief description


The establishment of regularities to which mass random phenomena are subject is based on the study of statistical data - the results of observations. The first task of mathematical statistics is to indicate the methods of collecting and grouping statistical information. The second task of mathematical statistics is to develop methods for analyzing statistical data, depending on the objectives of the study.

When solving any problem of mathematical statistics, two sources of information are available. The first and most definite (explicit) is the result of observations (experiment) in the form of a sample from some general population of a scalar or vector random variable. In this case, the sample size n can be fixed, or it can increase during the experiment (i.e., the so-called sequential procedures of statistical analysis can be used).

The second source is all the a priori information about the properties of the object under study that has been accumulated up to the current moment. Formally, the amount of a priori information is reflected in the initial statistical model chosen when solving the problem. Note that one cannot speak of an "approximate", in the usual sense, determination of the probability of an event from the results of experiments. An approximate determination of a quantity usually means that bounds for the error can be indicated which the error will not exceed. The frequency of an event, however, is random for any number of experiments, because of the randomness of the results of the individual experiments, and it may deviate significantly from the probability of the event. Therefore, in defining the unknown probability of an event as its frequency over a large number of experiments, we cannot indicate error bounds and guarantee that the error will stay within them. For this reason, in mathematical statistics one usually speaks not of approximate values of unknown quantities, but of their suitable values, the estimates.

The problem of estimating unknown parameters arises when the distribution function of the general population is known only up to a parameter θ. In this case it is necessary to find a statistic whose sample value, for the considered realization xₙ of the random sample, could be regarded as an approximate value of the parameter θ. A statistic whose sample value for any realization xₙ is taken as an approximate value of an unknown parameter is called its point estimate, or simply estimate; the sample value of this statistic is the value of the point estimate. A point estimate must satisfy quite definite requirements in order for its sample value to correspond to the true value of the parameter.

Another approach to solving the problem under consideration is also possible: to find statistics θ̲ = θ̲(xₙ) and θ̄ = θ̄(xₙ) such that with probability γ the following inequality is fulfilled:

θ̲(xₙ) < θ < θ̄(xₙ).

In this case, one speaks of an interval estimate for θ. The interval

(θ̲(xₙ), θ̄(xₙ))

is called the confidence interval for θ with the confidence coefficient γ.
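As an illustration of an interval estimate, the following sketch checks the coverage of the standard confidence interval for the mean of a normal sample with known σ; all numerical parameters are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)

# Interval estimate for the mean a of a normal sample with known sigma:
# (x_bar - u*sigma/sqrt(n), x_bar + u*sigma/sqrt(n)), confidence 0.95,
# where u = 1.96 is the 0.975-quantile of N(0, 1).
a_true, sigma, n = 4.0, 1.5, 100
covered = 0
for _ in range(10_000):
    x = rng.normal(a_true, sigma, n)
    half = 1.96 * sigma / np.sqrt(n)
    m = x.mean()
    covered += (m - half < a_true < m + half)
print("empirical coverage:", covered / 10_000)   # close to 0.95
```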

Having estimated one or another statistical characteristic from the results of experiments, a natural question arises: how consistent with the experimental data is the assumption (hypothesis) that the unknown characteristic has exactly the value obtained as a result of its estimation? This is how the second important class of problems in mathematical statistics arises: problems of testing hypotheses.

In a sense, the task of testing a statistical hypothesis is the inverse of the problem of parameter estimation. When evaluating a parameter, we know nothing about its true value. When testing a statistical hypothesis, for some reason, its value is assumed to be known and it is necessary to verify this assumption based on the results of the experiment.

In many problems of mathematical statistics, one considers sequences of random variables that converge in one sense or another to a certain limit (a random variable or a constant) as n → ∞.

Thus, the main tasks of mathematical statistics are the development of methods for finding estimates, the study of the accuracy of their approximation to the characteristics being estimated, and the development of methods for testing hypotheses.


1.5 Statistical hypothesis testing: basic concepts


The task of developing rational methods for testing statistical hypotheses is one of the main tasks of mathematical statistics. A statistical hypothesis (or simply a hypothesis) is any statement about the form or properties of the distribution of random variables observed in an experiment.

Let there be a sample that is a realization of a random sample from the general population, the distribution density of which depends on an unknown parameter θ.

Statistical hypotheses about the unknown true value of a parameter θ are called parametric hypotheses. Moreover, if θ is a scalar, then we are talking about one-parameter hypotheses, and if θ is a vector, then about multi-parameter hypotheses.

A statistical hypothesis is called simple if it has the form

H: θ = θ₀,

where θ₀ is some given value of the parameter.

A statistical hypothesis is called composite if it has the form

H: θ ∈ D,

where D is some set of parameter values consisting of more than one element.

In the case of testing two simple statistical hypotheses of the form

H₀: θ = θ₀,  H₁: θ = θ₁,

where θ₀, θ₁ are two given (different) values of the parameter, the first hypothesis is usually called the main (null) one, and the second one the alternative, or competing, hypothesis.

The criterion, or statistical criterion, for testing hypotheses is the rule according to which, according to the sample data, a decision is made about the validity of either the first or second hypothesis.

The criterion is specified using a critical set, which is a subset of the sample space of a random sample. The decision is made as follows:

1) if the sample belongs to the critical set, then the main hypothesis H₀ is rejected and the alternative hypothesis H₁ is accepted;

2) if the sample does not belong to the critical set (i.e., belongs to the complement of this set in the sample space), then the alternative hypothesis H₁ is rejected and the main hypothesis H₀ is accepted.

When using any criterion, errors of the following types are possible:

1) to accept the hypothesis H₁ when H₀ is true: an error of the first kind;

2) to accept the hypothesis H₀ when H₁ is true: an error of the second kind.

The probabilities of making errors of the first and second kind are denoted α and β:

α = P(H₁ | H₀),  β = P(H₀ | H₁),

where P(Hᵢ | Hⱼ) is the probability of accepting the hypothesis Hᵢ provided that the hypothesis Hⱼ is true. The indicated probabilities are calculated using the density function of the distribution of the random sample.

The probability of making a Type I error is also called the significance level of the test.

The value 1 − β, equal to the probability of rejecting the main hypothesis H₀ when the competing hypothesis H₁ is true, is called the power of the criterion.
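The significance level and the power are easy to estimate by simulation. In the sketch below, the hypotheses H₀: a = 0 and H₁: a = 0.5 for a normal sample with unit variance, the sample size, and α = 0.05 are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(5)

# Testing H0: a = 0 against H1: a = 0.5 for N(a, 1) samples of size n = 25
# with the critical set {x_bar > u_{0.95}/sqrt(n)}, alpha = 0.05.
n, trials = 25, 20_000
crit = 1.645 / np.sqrt(n)          # u_{0.95} = 1.645

alpha_hat = np.mean(rng.normal(0.0, 1.0, (trials, n)).mean(axis=1) > crit)
beta_hat = np.mean(rng.normal(0.5, 1.0, (trials, n)).mean(axis=1) <= crit)
print("type I error  ~", alpha_hat)     # about 0.05
print("type II error ~", beta_hat)
print("power         ~", 1 - beta_hat)
```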


1.6 Independence criterion


There is a sample ((X₁, Y₁), …, (Xₙ, Yₙ)) from a bivariate distribution with an unknown distribution function F(x, y), for which it is required to test the hypothesis

H: F(x, y) = F₁(x) F₂(y),

where F₁ and F₂ are some one-dimensional distribution functions.

A goodness-of-fit test for the hypothesis H can be constructed on the basis of the χ² methodology. This technique is used for discrete models with a finite number of outcomes, so we agree to assume that the random variable X takes a finite number s of values x₁, …, x_s, and the second component Y takes k values y₁, …, y_k. If the original model has a different structure, then the possible values of the random variables are preliminarily grouped separately in the first and second components: the set of values of X is divided into s intervals, the set of values of Y into k intervals, and the set of values of the pair (X, Y) itself into N = sk rectangles.

Denote by ν_{ij} the number of observations of the pair (xᵢ, yⱼ) (the number of sample elements belonging to the corresponding rectangle, if the data are grouped), so that Σ_{i,j} ν_{ij} = n. The results of observations are conveniently arranged in the form of a contingency table of two attributes (Table 1.1). In applications, X and Y usually mean two attributes by which the results of observation are classified.

Let p_{ij} = P(X = xᵢ, Y = yⱼ), i = 1, …, s, j = 1, …, k. Then the independence hypothesis means that there exist s + k constants p_{i·} and p_{·j}, with Σᵢ p_{i·} = 1 and Σⱼ p_{·j} = 1, such that p_{ij} = p_{i·} p_{·j}, i.e.

H: p_{ij} = p_{i·} p_{·j}, i = 1, …, s, j = 1, …, k.


Table 1.1

           y₁      y₂     …    y_k    | Sum
  x₁       ν₁₁     ν₁₂    …    ν_{1k} | ν_{1·}
  x₂       ν₂₁     ν₂₂    …    ν_{2k} | ν_{2·}
  …        …       …      …    …      | …
  x_s      ν_{s1}  ν_{s2} …    ν_{sk} | ν_{s·}
  Sum      ν_{·1}  ν_{·2} …    ν_{·k} | n

Thus, the hypothesis H is reduced to the statement that the frequencies ν_{ij} (their number is N = sk) are distributed according to the polynomial law with probabilities of outcomes p_{ij} = p_{i·} p_{·j} having the specified specific structure (the vector of probabilities of outcomes p is determined by the values of r = s + k − 2 unknown parameters p_{1·}, …, p_{s−1,·}, p_{·1}, …, p_{·,k−1}).

To test this hypothesis, we find maximum likelihood estimates for the unknown parameters that determine the scheme under consideration. If the null hypothesis is true, then the likelihood function has the form

L(p) = c Π_{i=1}^{s} p_{i·}^{ν_{i·}} Π_{j=1}^{k} p_{·j}^{ν_{·j}},

where the factor c does not depend on the unknown parameters. Hence, using the method of indefinite Lagrange multipliers, we obtain that the desired estimates have the form

p̂_{i·} = ν_{i·}/n,  p̂_{·j} = ν_{·j}/n.

Therefore, the statistic

χ²ₙ = Σ_{i,j} (ν_{ij} − ν_{i·}ν_{·j}/n)² / (ν_{i·}ν_{·j}/n)

has in the limit (as n → ∞, when H is true) the distribution χ²((s−1)(k−1)), since the number of degrees of freedom in the limit distribution is N − 1 − r = sk − 1 − (s + k − 2) = (s−1)(k−1).

So, for sufficiently large n, the following hypothesis testing rule can be used: the hypothesis H is rejected if and only if the value of the statistic χ²ₙ calculated from the actual data satisfies the inequality

χ²ₙ > χ²_{1−α}((s−1)(k−1)),

where χ²_{1−α}((s−1)(k−1)) is the quantile of level 1 − α of the χ² distribution with (s−1)(k−1) degrees of freedom. This criterion has an asymptotically (as n → ∞) given significance level α and is called the χ² independence criterion.
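A minimal sketch of the statistic just derived (the function name chi2_independence is introduced here purely for illustration):

```python
import numpy as np

def chi2_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an
    s x k contingency table of frequencies nu_ij."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    # expected frequencies nu_i. * nu_.j / n under independence
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    stat = ((table - expected) ** 2 / expected).sum()
    s, k = table.shape
    return stat, (s - 1) * (k - 1)
```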

2. PRACTICAL PART


2.1 Solutions of problems on types of convergence


1. Prove that almost sure convergence implies convergence in probability. Give an example showing that the converse is not true.

Solution. Let a sequence of random variables ξₙ converge to a random variable ξ almost surely. Then for any ε > 0

P(sup_{k≥n} |ξₖ − ξ| ≥ ε) → 0, n → ∞.

Since

P(|ξₙ − ξ| ≥ ε) ≤ P(sup_{k≥n} |ξₖ − ξ| ≥ ε),

the almost sure convergence of ξₙ to ξ implies that ξₙ converges to ξ in probability.

But the converse is not true. Let ξ₂, ξ₃, … be a sequence of independent random variables having the same distribution function F(x), equal to zero for x ≤ 0 and to 1 − e^{−x} for x > 0. Consider the sequence

ηₙ = ξₙ / ln n, n ≥ 2.

This sequence converges to zero in probability, since

P(ηₙ ≥ ε) = P(ξₙ ≥ ε ln n) = e^{−ε ln n} = n^{−ε}

tends to zero for any fixed ε > 0. However, convergence to zero almost surely does not take place. Indeed, for ε < 1, by the independence of the ηₖ,

P(sup_{k≥n} ηₖ < ε) = Π_{k≥n} (1 − k^{−ε}) = 0,

so that P(sup_{k≥n} ηₖ ≥ ε) = 1 for every n; that is, with probability 1 the sequence contains, for any n, realizations that exceed ε.

Note that in the presence of certain additional conditions imposed on ξₙ, convergence in probability does imply almost sure convergence.

2. Let ξₙ be a monotone sequence. Prove that in this case the convergence of ξₙ to ξ in probability implies the convergence of ξₙ to ξ with probability 1.

Solution. Let ξₙ be a monotonically decreasing sequence, i.e. ξ₁ ≥ ξ₂ ≥ … ≥ ξₙ ≥ …. To simplify the reasoning, assume ξ ≡ 0 and ξₙ ≥ 0 for all n. Let ξₙ converge to 0 in probability, but suppose that convergence almost surely does not take place. Then there exists ε > 0 and δ > 0 such that for all n

P(sup_{k≥n} ξₖ ≥ ε) ≥ δ > 0.

But by monotonicity sup_{k≥n} ξₖ = ξₙ, so what has been said also means that for all n

P(ξₙ ≥ ε) ≥ δ > 0,

which contradicts the convergence of ξₙ to 0 in probability. Thus, for a monotone sequence ξₙ converging to ξ in probability, convergence with probability 1 (almost surely) also takes place.

3. Let the sequence ξₙ converge to ξ in probability. Prove that from this sequence one can select a subsequence ξ_{n_k} converging to ξ with probability 1 as k → ∞.

Solution. Let (ε_k) be some sequence of positive numbers with ε_k → 0, and let (δ_k) be positive numbers such that the series Σ_k δ_k converges. Construct a sequence of indices n₁ < n₂ < … such that

P(|ξ_{n_k} − ξ| > ε_k) < δ_k

(this is possible by the convergence in probability). Then the series

Σ_k P(|ξ_{n_k} − ξ| > ε_k)

converges. Since the series converges, for any ε > 0 the remainder of the series tends to zero. But then, choosing K so that ε_k ≤ ε for k ≥ K,

P(sup_{k≥K} |ξ_{n_k} − ξ| > ε) ≤ Σ_{k≥K} P(|ξ_{n_k} − ξ| > ε_k) → 0, K → ∞,

and by Theorem 1 of Section 1.1 this means that ξ_{n_k} → ξ with probability 1.

4. Prove that convergence in mean of any positive order implies convergence in probability. Give an example showing that the converse is not true.

Solution. Let the sequence ξₙ converge to ξ in mean of order p > 0, i.e.

M|ξₙ − ξ|ᵖ → 0, n → ∞.

Let us use the generalized Chebyshev inequality: for arbitrary ε > 0 and p > 0

P(|ξₙ − ξ| ≥ ε) ≤ M|ξₙ − ξ|ᵖ / εᵖ.

Letting n → ∞ and taking into account that M|ξₙ − ξ|ᵖ → 0, we get

P(|ξₙ − ξ| ≥ ε) → 0,

that is, ξₙ converges to ξ in probability.

However, convergence in probability does not entail convergence in mean of order p > 0. This is shown by the following example. Consider the probability space ⟨Ω, F, P⟩, where Ω = [0, 1], F = B is the Borel σ-algebra and P is the Lebesgue measure.

We define a sequence of random variables as follows:

ξₙ(ω) = eⁿ for ω ∈ [0, 1/n],  ξₙ(ω) = 0 for ω ∈ (1/n, 1].

The sequence ξₙ converges to 0 in probability, because for any fixed ε > 0

P(|ξₙ| ≥ ε) ≤ 1/n → 0,

but for any p > 0

M|ξₙ|ᵖ = e^{np} · (1/n) → ∞, n → ∞,

that is, there is no convergence in mean.

5. Let ξₙ →ᴾ ξ, and suppose that |ξₙ| ≤ C for all n, where C is a constant. Prove that in this case ξₙ converges to ξ in mean square.

Solution. Note first that P(|ξ| ≤ C) = 1, since for any δ > 0, P(|ξ| > C + δ) ≤ P(|ξₙ − ξ| > δ) → 0. Let us obtain an estimate for M|ξₙ − ξ|². Let ε be an arbitrary positive number. Then |ξₙ − ξ|² ≤ ε² on the event {|ξₙ − ξ| ≤ ε}, and |ξₙ − ξ|² ≤ (2C)² on the event {|ξₙ − ξ| > ε}. Hence

M|ξₙ − ξ|² ≤ ε² + 4C² P(|ξₙ − ξ| > ε).

Since ε is arbitrarily small and P(|ξₙ − ξ| > ε) → 0 as n → ∞, it follows that M|ξₙ − ξ|² → 0, that is, ξₙ → ξ in mean square.

6. Prove that if ξₙ converges to ξ in probability, then weak convergence takes place. Give an example showing that the converse is not true.

Solution. Let us prove that if ξₙ →ᴾ ξ, then Fₙ(x) → F(x) at every point x of continuity of F (this is a necessary and sufficient condition for weak convergence); here Fₙ is the distribution function of ξₙ and F is that of ξ.

Let x be a point of continuity of the function F, and let ε > 0. If ξ ≤ x − ε, then at least one of the inequalities ξₙ ≤ x or |ξₙ − ξ| ≥ ε holds. Then

F(x − ε) ≤ Fₙ(x) + P(|ξₙ − ξ| ≥ ε).

Similarly, if ξₙ ≤ x, then at least one of the inequalities ξ ≤ x + ε or |ξₙ − ξ| ≥ ε holds, and

Fₙ(x) ≤ F(x + ε) + P(|ξₙ − ξ| ≥ ε).

Since ξₙ →ᴾ ξ, for arbitrarily small δ > 0 there exists N such that for all n > N

F(x − ε) − δ ≤ Fₙ(x) ≤ F(x + ε) + δ.

On the other hand, since x is a point of continuity of F, for arbitrarily small δ one can find ε > 0 such that

F(x) − δ ≤ F(x − ε),  F(x + ε) ≤ F(x) + δ.

So, for arbitrarily small δ there exists N such that for n > N

F(x) − 2δ ≤ Fₙ(x) ≤ F(x) + 2δ,

or, which is the same,

|Fₙ(x) − F(x)| ≤ 2δ.

This means that Fₙ(x) converges to F(x) at all points of continuity. Therefore, convergence in probability implies weak convergence.

The converse assertion, generally speaking, does not hold. To verify this, take a sequence of random variables ξ, ξ₁, ξ₂, … that are not constant with probability 1, have the same distribution function F(x), and are such that ξₙ and ξ are independent for all n. Obviously, weak convergence takes place, since all members of the sequence have the same distribution function. Consider

P(|ξₙ − ξ| ≥ ε).

From the independence and the identical distribution of ξₙ and ξ it follows that this probability is the same for all n. Let us choose among all distribution functions of nondegenerate random variables an F(x) for which P(|ξₙ − ξ| ≥ ε) is different from zero for all sufficiently small ε. Then P(|ξₙ − ξ| ≥ ε) does not tend to zero as n grows without limit, and convergence in probability does not take place.

7. Let weak convergence ξₙ ⇒ ξ take place, where ξ is equal to a constant a with probability 1. Prove that in this case ξₙ converges to a in probability.

Solution. Let P(ξ = a) = 1. Then F(x) = 0 for x ≤ a and F(x) = 1 for x > a, so that every x ≠ a is a point of continuity of F, and weak convergence means that Fₙ(x) → 0 for x < a and Fₙ(x) → 1 for x > a. It follows that for any ε > 0 the probabilities

P(ξₙ < a − ε) = Fₙ(a − ε) and P(ξₙ > a + ε) ≤ 1 − Fₙ(a + ε)

tend to zero as n → ∞. This means that

P(|ξₙ − a| > ε) = P(ξₙ < a − ε) + P(ξₙ > a + ε)

tends to zero as n → ∞, that is, ξₙ converges to a in probability.

2.2 Solving problems on the CLT


The value of the gamma function Γ(x) at a given point is calculated by the Monte Carlo method. Let us find the minimum number of trials necessary to ensure that, with probability 0.95, one can expect the relative error of the calculations to be less than one percent.

By definition, for x > 0 we have

Γ(x) = ∫₀^∞ t^{x−1} e^{−t} dt.   (1)

Making a change of variable in (1) that maps (0, ∞) onto (0, 1), for example t = u/(1 − u), we arrive at an integral over a finite interval:

Γ(x) = ∫₀¹ g(u) du, where g(u) = (u/(1 − u))^{x−1} e^{−u/(1−u)} (1 − u)^{−2}.

As can be seen, Γ(x) can be represented in the form M g(η), where η is distributed uniformly on [0, 1]. Let n statistical trials be produced. Then the statistical analogue of Γ(x) is the quantity

Γ̂ₙ = (1/n) Σ_{i=1}^{n} g(ηᵢ),

where η₁, …, ηₙ are independent random variables with the uniform distribution on [0, 1]. Wherein M Γ̂ₙ = Γ(x) and D Γ̂ₙ = σ²/n, where σ² = D g(η₁).

It follows from the CLT that Γ̂ₙ is asymptotically normal with parameters (Γ(x), σ²/n). Requiring

P(|Γ̂ₙ − Γ(x)| / Γ(x) < 0.01) ≥ 0.95

and using the normal approximation, we obtain

0.01 · Γ(x) √n / σ ≥ u_{0.975} = 1.96, i.e. n ≥ (196 σ / Γ(x))².

This means that the minimum number of trials that provides, with probability 0.95, a relative calculation error of no more than one percent is equal to ⌈(196 σ / Γ(x))²⌉.
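A sketch of the described Monte Carlo computation, assuming the substitution t = u/(1 − u) and taking x = 1.5 purely for illustration:

```python
import numpy as np
from math import gamma

rng = np.random.default_rng(6)

# Monte Carlo estimate of Gamma(x) via the substitution t = u/(1-u):
#   Gamma(x) = int_0^1 (u/(1-u))^(x-1) * exp(-u/(1-u)) / (1-u)^2 du.
x = 1.5                 # illustrative choice of the argument
n = 1_000_000
u = rng.uniform(0, 1, n)
t = u / (1 - u)
f = t ** (x - 1) * np.exp(-t) / (1 - u) ** 2
print("Monte Carlo:", f.mean(), "+/-", 1.96 * f.std() / np.sqrt(n))
print("exact      :", gamma(x))
```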


We consider a sequence of 2000 independent identically distributed random variables with mathematical expectation equal to 4 and variance equal to 1.8. The arithmetic mean of these quantities is the random variable η. Determine the probability that η takes a value in the interval (3.94, 4.12).

Let ξ₁, …, ξₙ, … be a sequence of independent random variables having the same distribution with Mξᵢ = a = 4 and Dξᵢ = σ² = 1.8. Then the CLT is applicable to the sequence (ξₙ). The random variable

η = (1/n) Σ_{i=1}^{n} ξᵢ

is asymptotically normal with parameters (a, σ²/n). The probability that it takes a value in the interval (α, β):

P(α < η < β) ≈ Φ((β − a)√n / σ) − Φ((α − a)√n / σ).

For n = 2000, α = 3.94 and β = 4.12 we have σ/√n = √(1.8/2000) = 0.03, so we get

P(3.94 < η < 4.12) ≈ Φ((4.12 − 4)/0.03) − Φ((3.94 − 4)/0.03) = Φ(4) − Φ(−2) ≈ 0.9772.

2.3 Testing hypotheses by the independence criterion


As a result of the study, it was found that 782 light-eyed fathers also have light-eyed sons, and 89 light-eyed fathers have dark-eyed sons. 50 dark-eyed fathers also have dark-eyed sons, and 79 dark-eyed fathers have light-eyed sons. Is there a relationship between the eye color of fathers and the color of the eyes of their sons? The level of confidence is taken equal to 0.99.


Table 2.1

  Children \ Fathers | Light-eyed | Dark-eyed | Sum
  Light-eyed         |    782     |     79    |  861
  Dark-eyed          |     89     |     50    |  139
  Sum                |    871     |    129    | 1000

H₀: there is no relationship between the eye color of children and fathers.

H₁: there is a relationship between the eye color of children and fathers.



Here s = k = 2, so the statistic has 1 degree of freedom; the computed value is χ² = 90.6052.

The calculation was made in Mathematica 6.

Since χ² > χ²_{0.99}(1) = 6.635, the hypothesis H₀ of the absence of a relationship between the eye color of fathers and children should be rejected at the given significance level, and the alternative hypothesis H₁ should be accepted.
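One way to recompute the statistic and the critical value, here with SciPy as an alternative to the Mathematica 6 computation mentioned above:

```python
import numpy as np
from scipy.stats import chi2

table = np.array([[782, 79],
                  [89, 50]], dtype=float)
n = table.sum()
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
stat = ((table - expected) ** 2 / expected).sum()
print("chi-square statistic:", stat)
print("critical value, alpha = 0.01, df = 1:", chi2.ppf(0.99, df=1))  # 6.635
```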


It is claimed that the effect of the drug depends on the method of application. Check this statement using the data presented in Table 2.2. The confidence level is taken to be 0.95.


Table 2.2

  Result       | Method of application
               |   A  |   B  |   C
  Unfavorable  |  11  |  17  |  16
  Favorable    |  20  |  23  |  19

Solution.

To solve this problem, we use the contingency table of two features.


Table 2.3

  Result       |   A  |   B  |   C  | Sum
  Unfavorable  |  11  |  17  |  16  |  44
  Favorable    |  20  |  23  |  19  |  62
  Sum          |  31  |  40  |  35  | 106

H₀: the effect of the drug does not depend on the method of application.

H₁: the effect of the drug depends on the method of application.

The statistic is calculated by the formula

χ² = Σ_{i,j} (ν_{ij} − ν_{i·}ν_{·j}/n)² / (ν_{i·}ν_{·j}/n).

Here s = 2, k = 3; the computed value is χ² = 0.734626 with 2 degrees of freedom.


The calculation was made in Mathematica 6.

According to the distribution tables, we find χ²_{0.95}(2) = 5.991.

Since χ² < χ²_{0.95}(2), the hypothesis H₀ that the effect of the drug does not depend on the method of application should be accepted at the given significance level.
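A similar cross-check for Table 2.3 (the values in the comments are what this formula yields):

```python
import numpy as np
from scipy.stats import chi2

table = np.array([[11, 17, 16],
                  [20, 23, 19]], dtype=float)
n = table.sum()
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
stat = ((table - expected) ** 2 / expected).sum()
print("chi-square statistic:", stat)                               # ~ 0.7346
print("critical value, alpha = 0.05, df = 2:", chi2.ppf(0.95, 2))  # ~ 5.991
```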


Conclusion


This paper presents theoretical material from the sections "Independence criterion" and "Limit theorems of probability theory" of the course "Probability theory and mathematical statistics". In the course of the work, the independence criterion was tested in practice; also, for given sequences of independent random variables, the fulfillment of the central limit theorem was verified.

This work helped to improve my knowledge of these sections of probability theory, to work with the literature, and to master the technique of testing the independence criterion.




INTRODUCTION

Many things are incomprehensible to us, not because our concepts are weak;
but because these things do not enter the circle of our concepts.
Kozma Prutkov

The main goal of studying mathematics in secondary specialized educational institutions is to give students a set of mathematical knowledge and skills necessary for studying other program disciplines that use mathematics to one degree or another, for the ability to perform practical calculations, for the formation and development of logical thinking.

In this paper, all the basic concepts of the section of mathematics "Fundamentals of Probability Theory and Mathematical Statistics", provided for by the program and the State Educational Standards of Secondary Vocational Education (Ministry of Education of the Russian Federation, Moscow, 2002), are introduced consistently; the main theorems are formulated, most of them without proof. The main problems, the methods for their solution, and the technology of applying these methods to practical problems are considered. The presentation is accompanied by detailed comments and numerous examples.

Methodological instructions can be used for initial acquaintance with the material being studied, when taking notes of lectures, for preparing for practical exercises, for consolidating the acquired knowledge, skills and abilities. In addition, the manual will be useful for undergraduate students as a reference tool that allows you to quickly restore in memory what was previously studied.

At the end of the work, examples and tasks are given that students can perform in self-control mode.

Methodological instructions are intended for students of correspondence and full-time forms of education.

BASIC CONCEPTS

Probability theory studies the objective regularities of mass random events. It is the theoretical basis for mathematical statistics, which deals with the development of methods for collecting, describing and processing the results of observations. Knowledge of the phenomena of the real world comes through observations (tests, experiments), that is, through experience in the broad sense of the word.

In our practical activities, we often encounter phenomena, the outcome of which cannot be predicted, the result of which depends on chance.

A random phenomenon can be characterized by the ratio of the number of its occurrences to the number of trials, in each of which, under the same conditions of all trials, it could occur or not occur.

Probability theory is a branch of mathematics in which random phenomena (events) are studied and regularities are revealed when they are massively repeated.

Mathematical statistics is a branch of mathematics that has as its subject the study of methods for collecting, systematizing, processing and using statistical data to obtain scientifically based conclusions and decision-making.

At the same time, statistical data is understood as a set of numbers that represent the quantitative characteristics of the features of the studied objects that are of interest to us. Statistical data are obtained as a result of specially designed experiments and observations.

Statistical data in its essence depend on many random factors, so mathematical statistics is closely related to probability theory, which is its theoretical basis.

I. PROBABILITY. THEOREMS OF ADDITION AND MULTIPLICATION OF PROBABILITIES

1.1. Basic concepts of combinatorics

In the branch of mathematics called combinatorics, some problems are solved that are related to the consideration of sets and the compilation of various combinations of the elements of these sets. For example, if we take 10 different digits 0, 1, 2, 3, …, 9 and make combinations of them, we will get different numbers, for example 143, 431, 5671, 1207, 43, etc.

We see that some of these combinations differ only in the order of the digits (for example, 143 and 431), others in the digits included in them (for example, 5671 and 1207), and still others also differ in the number of digits (for example, 143 and 43).

Thus, the obtained combinations satisfy various conditions.

Depending on the compilation rules, three types of combinations can be distinguished: permutations, placements, combinations.

Let us first get acquainted with the concept of a factorial.

The product of all natural numbers from 1 to n inclusive is called n factorial and is written n! = 1 · 2 · 3 · … · n.

Example 1. Calculate: a) … ; b) … ; c) … .

Solution. a) … .

b) Since the factorial of a larger number contains the factorial of a smaller one as a factor, the common factor can be taken out of brackets; after cancellation we obtain the answer.

c) … .

Permutations.

Combinations of n elements that differ from each other only in the order of the elements are called permutations.

Permutations are denoted by the symbol Pₙ, where n is the number of elements in each permutation (P is the first letter of the French word permutation).

The number of permutations can be calculated using the formula

Pₙ = n(n − 1)(n − 2) · … · 2 · 1,

or, using the factorial,

Pₙ = n!.

Let us remember that 0! = 1 and 1! = 1.

Example 2. In how many ways can six different books be arranged on one shelf?

Solution. The desired number of ways is equal to the number of permutations of 6 elements, i.e. P₆ = 6! = 720.

Accommodations.

Arrangements (placements) of m elements taken n at a time are combinations that differ from each other either in the elements themselves (at least one) or in the order of their arrangement.

Arrangements are denoted by the symbol Aₘⁿ, where m is the number of all available elements and n is the number of elements in each combination (A is the first letter of the French word arrangement, which means "placement, putting in order").

It is assumed here that n ≤ m.

The number of arrangements can be calculated using the formula

Aₘⁿ = m(m − 1)(m − 2) · … · (m − n + 1),

i.e. the number of all possible arrangements of m elements taken n at a time is equal to the product of n consecutive integers, of which the greatest is m.

We write this formula in factorial form:

Aₘⁿ = m! / (m − n)!.

Example 3. How many variants of the distribution of three vouchers to sanatoria of different profiles can be made for five applicants?

Solution. The desired number of variants is equal to the number of arrangements of 5 elements taken 3 at a time, i.e.

A₅³ = 5 · 4 · 3 = 60.

Combinations.

Combinations of m elements taken n at a time are all possible combinations that differ from each other by at least one element (here m and n are natural numbers with n ≤ m).

The number of combinations of m elements taken n at a time is denoted Cₘⁿ (C is the first letter of the French word combinaison, combination).

In general, the number of combinations of m elements taken n at a time is equal to the number of arrangements of m elements taken n at a time divided by the number of permutations of n elements:

Cₘⁿ = Aₘⁿ / Pₙ.

Using the factorial formulas for the numbers of arrangements and permutations, we get:

Cₘⁿ = m! / (n!(m − n)!).

Example 4. In a team of 25 people, four must be allocated to work in a certain area. In how many ways can this be done?

Solution. Since the order of the chosen four people does not matter, this can be done in Cₘⁿ = C₂₅⁴ ways.

We find by the first formula

C₂₅⁴ = A₂₅⁴ / P₄ = (25 · 24 · 23 · 22) / (4 · 3 · 2 · 1) = 12650.

In addition, when solving problems, the following formulas are used that express the main properties of combinations:

(by definition, and are assumed);

.
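Python's standard library provides these counting functions directly; the following sketch reproduces Examples 2-4:

```python
from math import factorial, perm, comb

# Example 2: permutations of 6 books on a shelf
print(factorial(6))      # P_6 = 720

# Example 3: arrangements of 5 applicants taken 3 at a time
print(perm(5, 3))        # A^3_5 = 60

# Example 4: combinations of 25 people taken 4 at a time
print(comb(25, 4))       # C^4_25 = 12650
```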

1.2. Solving combinatorial problems

Task 1. 16 subjects are studied at the faculty. On Monday, 3 subjects must be put in the schedule. In how many ways can this be done?

Solution. There are as many ways to schedule three subjects out of 16 as there are arrangements of 16 elements taken 3 at a time:

A₁₆³ = 16 · 15 · 14 = 3360.

Task 2. Out of 15 objects, 10 objects must be selected. In how many ways can this be done?

Solution. Since the order of selection does not matter, the number of ways is equal to the number of combinations:

C₁₅¹⁰ = 15! / (10! 5!) = 3003.

Task 3. Four teams participated in a competition. How many variants of the distribution of places between them are possible?

Solution. The number of variants is equal to the number of permutations:

P₄ = 4! = 24.

Task 4. In how many ways can a patrol of three soldiers and one officer be formed if there are 80 soldiers and 3 officers?

Solution. The soldiers on patrol can be selected in

C₈₀³ = (80 · 79 · 78) / (3 · 2 · 1) = 82160

ways, and the officer in C₃¹ = 3 ways. Since any officer can go with each team of soldiers, there are in total 82160 · 3 = 246480 ways.

Task 5. Find n, given that … .

Solution. Using the factorial formula for the number of combinations and simplifying, we arrive at an equation for n. By the definition of combinations, n must be a natural number satisfying this equation; hence n = … .

1.3. The concept of a random event. Types of events. Probability of an event

Any action, phenomenon or observation with several different outcomes, realized under a given set of conditions, will be called a trial.

The result of this action or observation is called an event.

If an event under given conditions can occur or not occur, then it is called random. If an event must certainly occur, it is called certain, and if it certainly cannot occur, it is called impossible.

The events are called incompatible if only one of them can appear each time.

The events are called joint if, under the given conditions, the occurrence of one of these events does not exclude the occurrence of the other in the same test.

The events are called opposite , if under the test conditions they, being its only outcomes, are incompatible.

Events are usually denoted by capital letters of the Latin alphabet: A, B, C, D, … .

A complete system of events A₁, A₂, A₃, …, Aₙ is a set of incompatible events, the occurrence of at least one of which is mandatory in a given trial.

If a complete system consists of two incompatible events, then such events are called opposite and are denoted A and Ā.

Example. There are 30 numbered balls in a box. Determine which of the following events are impossible, certain, opposite:

drawn a numbered ball (A);

drawn a ball with an even number (B);

drawn a ball with an odd number (C);

drawn a ball without a number (D).

Which of them form a complete group?

Solution. A is a certain event; D is an impossible event;

B and C are opposite events.

The complete groups of events are A and D, and B and C.

The probability of an event is considered as a measure of the objective possibility of the occurrence of a random event.

1.4. The classical definition of probability

The number which is an expression of the measure of the objective possibility of the occurrence of an event is called the probability of this event and is denoted by the symbol P(A).

Definition. The probability of an event A is the ratio of the number m of outcomes favorable to the occurrence of the event A to the number n of all outcomes (incompatible, uniquely possible and equally possible), i.e.

P(A) = m / n.

Therefore, to find the probability of an event, it is necessary, after considering the various outcomes of the trial, to count all possible incompatible outcomes n, choose the number m of outcomes of interest to us and calculate the ratio of m to n.

The following properties follow from this definition:

1. The probability of any event is a non-negative number not exceeding one.

Indeed, the number m of favorable outcomes satisfies 0 ≤ m ≤ n. Dividing all parts by n, we get

0 ≤ P(A) ≤ 1.

2. The probability of a certain event is equal to one, because in this case m = n, so P = n/n = 1.

3. The probability of an impossible event is zero, because in this case m = 0, so P = 0/n = 0.

Problem 1. There are 200 winners out of 1000 tickets in the lottery. One ticket is drawn at random. What is the probability that this ticket wins?

Solution. The total number of different outcomes is n = 1000. The number of outcomes favoring a win is m = 200. According to the formula, we get

P(A) = m / n = 200 / 1000 = 0.2.

Problem 2. In a batch of 18 parts, there are 4 defective ones. 5 parts are chosen at random. Find the probability that two of these 5 parts are defective.

Solution. The number n of all equally possible outcomes is equal to the number of combinations of 18 elements taken 5 at a time, i.e.

n = C₁₈⁵ = 8568.

Let us calculate the number m of outcomes that favor event A. Among the 5 randomly selected parts there should be 3 high-quality and 2 defective ones. The number of ways to select two defective parts from the 4 available defective ones is equal to the number of combinations of 4 taken 2 at a time:

C₄² = 6.

The number of ways to select three quality parts from the 14 available quality parts is equal to

C₁₄³ = 364.

Any group of quality parts can be combined with any group of defective parts, so the total number of combinations m is

m = C₄² · C₁₄³ = 6 · 364 = 2184.

The desired probability of the event A is equal to the ratio of the number m of outcomes favoring this event to the number n of all equally possible outcomes:

P(A) = 2184 / 8568 ≈ 0.255.
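The same computation in Python:

```python
from math import comb

n_total = comb(18, 5)             # all ways to choose 5 of 18 parts
m_fav = comb(4, 2) * comb(14, 3)  # 2 defective of 4, 3 good of 14
print(m_fav / n_total)            # ~ 0.255
```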

The sum of a finite number of events is an event consisting in the occurrence of at least one of them.

The sum of two events is denoted by the symbol A + B, and the sum of n events by the symbol A₁ + A₂ + … + Aₙ.

The theorem of addition of probabilities.

The probability of the sum of two incompatible events is equal to the sum of the probabilities of these events:

P(A + B) = P(A) + P(B).

Corollary 1. If the events A₁, A₂, …, Aₙ form a complete system, then the sum of the probabilities of these events is equal to one:

P(A₁) + P(A₂) + … + P(Aₙ) = 1.

Corollary 2. The sum of the probabilities of opposite events A and Ā is equal to one:

P(A) + P(Ā) = 1.

Problem 1. There are 100 lottery tickets. It is known that 5 tickets win 20,000 rubles each, 10 tickets 15,000 rubles, 15 tickets 10,000 rubles, 25 tickets 2,000 rubles, and the rest win nothing. Find the probability that the purchased ticket will win at least 10,000 rubles.

Solution. Let A, B and C be the events consisting in the fact that the purchased ticket wins 20,000, 15,000 and 10,000 rubles, respectively. Since the events A, B and C are incompatible,

P(A + B + C) = P(A) + P(B) + P(C) = 0.05 + 0.10 + 0.15 = 0.3.

Problem 2. The correspondence department of the technical school receives tests in mathematics from cities A, B and C. The probability of receiving a test from city A is 0.6, from city B it is 0.1. Find the probability that the next test comes from city C.

Solution. The events "the work came from A", "from B" and "from C" form a complete system, therefore

P(C) = 1 − (0.6 + 0.1) = 0.3.