С - визначити, чи є число простим


75

Я намагаюся придумати метод, який приймає ціле число і повертає логічне значення, щоб сказати, чи є число простим чи ні, і я не знаю багато C; хтось піклується про те, щоб мені підказати?

В основному, я б зробив це в C # так:

static bool IsPrime(int number)
{
    for (int i = 2; i < number; i++)
    {
        if (number % i == 0 && i != number)
            return false;
    }
    return true;
}

1
This is more of a math question than a programming question, surely?
bdonlan

49
Here's some pointers: int *ptr; int *ptr2; int *ptr3. Sorry couldn't help it. How big are the numbers you will be checking? And also, do you want a heuristic or something that always works?
AlbertoPL

3
Come up with you algorithm (the way you test it without code) and then maybe we can help express it in C.
BobbyShaftoe

12
What's the point of 'i != number' when you have 'i < number' as a condition to execute the loop?
Matthieu M.

3
Also note that checking i < number is overkill. By definition, if a number x = a * b, either a or b is < int(sqrt(x)) and the other is greater. So your loop should only need to go up to int(sqrt(x)).
twalberg

Відповіді:


153

OK, so forget about C. Suppose I give you a number and ask you to determine if it's prime. How do you do it? Write down the steps clearly, then worry about translating them into code.

Once you have the algorithm determined, it will be much easier for you to figure out how to write a program, and for others to help you with it.

edit: Here's the C# code you posted:

static bool IsPrime(int number) {
    for (int i = 2; i < number; i++) {
        if (number % i == 0 && i != number) return false;
    }
    return true;
}

This is very nearly valid C as is; there's no bool type in C, and no true or false, so you need to modify it a little bit (edit: Kristopher Johnson correctly points out that C99 added the stdbool.h header). Since some people don't have access to a C99 environment (but you should use one!), let's make that very minor change:

int IsPrime(int number) {
    int i;
    for (i=2; i<number; i++) {
        if (number % i == 0 && i != number) return 0;
    }
    return 1;
}

This is a perfectly valid C program that does what you want. We can improve it a little bit without too much effort. First, note that i is always less than number, so the check that i != number always succeeds; we can get rid of it.

Also, you don't actually need to try divisors all the way up to number - 1; you can stop checking when you reach sqrt(number). Since sqrt is a floating-point operation and that brings a whole pile of subtleties, we won't actually compute sqrt(number). Instead, we can just check that i*i <= number:

int IsPrime(int number) {
    int i;
    for (i=2; i*i<=number; i++) {
        if (number % i == 0) return 0;
    }
    return 1;
}

One last thing, though; there was a small bug in your original algorithm! If number is negative, or zero, or one, this function will claim that the number is prime. You likely want to handle that properly, and you may want to make number be unsigned, since you're more likely to care about positive values only:

int IsPrime(unsigned int number) {
    if (number <= 1) return 0; // zero and one are not prime
    unsigned int i;
    for (i=2; i*i<=number; i++) {
        if (number % i == 0) return 0;
    }
    return 1;
}

This definitely isn't the fastest way to check if a number is prime, but it works, and it's pretty straightforward. We barely had to modify your code at all!


11
FYI, the C99 standard defines a <stdbool.h> header that provides bool, true, and false.
Kristopher Johnson

27
I know that it is simpler to compute a square than a square root, however computing a square on each iteration ought to cost MORE that computing the square root once and be done with it :x
Matthieu M.

6
On a modern out-of-order machine, the latency of the mul instruction to square i should be entirely hidden in the latency of the modulus, so there would be no appreciable performance win. On a strictly in-order machine, there is a win to be had using a hoisted square root, but that potentially raises issues of floating-point imprecision if the code were compiled on a platform with a large int type (64 bits or bigger). All that can be dealt with, but I thought it best to keep things simple and trivially portable. After all, if you care about speed, you're not using this algorithm at all.
Stephen Canon

5
@Tom you can improve a lot more by stopping at the floor(sqrt(number)). Take 11, for example, floor(sqrt(11)) = 3. The number after 3 is 4, 3*4 = 12 > 11. If you're using a naive sieve to check for primality, you only need to check odd numbers up to the sqrt of the original, aside from 2.
Calyth

3
-1. The final function gives the incorrect answer for 4294967291.
davidg

27

I'm suprised that no one mentioned this.

Use the Sieve Of Eratosthenes

Details:

  1. Basically nonprime numbers are divisible by another number besides 1 and themselves
  2. Therefore: a nonprime number will be a product of prime numbers.

The sieve of Eratosthenes finds a prime number and stores it. When a new number is checked for primeness all of the previous primes are checked against the know prime list.

Reasons:

  1. This algorithm/problem is known as "Embarrassingly Parallel"
  2. It creates a collection of prime numbers
  3. Its an example of a dynamic programming problem
  4. Its quick!

8
It's also O(n) in space, and as long as your computation is for a single value, this is a huge waste of space for no performance gain.
R.. GitHub STOP HELPING ICE

3
(Actually O(n log n) or larger if you're supporting large numbers...)
R.. GitHub STOP HELPING ICE

2
Who computes only 1 value for a prime for the life span of the application? Primes are a good candidate to be cached.
monksy

2
A command line program that terminates after one query would be an obvious example. In any case, keeping global state is ugly and should always be considered a trade-off. And I would go so far as to say the sieve (generated at runtime) is essentially useless. If your prime candidates are small enough that you can fit a sieve that size in memory, you should just have a static const bitmap of which numbers are prime and use that, rather than filling it at runtime.
R.. GitHub STOP HELPING ICE

1
Sieve of Eratosthenes is a good (well, good-ish) way of solving the problem "generate all of the primes up to n". It's a wasteful way of solving the problem "is n prime?"
hobbs

16

Stephen Canon answered it very well!

But

  • The algorithm can be improved further by observing that all primes are of the form 6k ± 1, with the exception of 2 and 3.
  • This is because all integers can be expressed as (6k + i) for some integer k and for i = −1, 0, 1, 2, 3, or 4; 2 divides (6k + 0), (6k + 2), (6k + 4); and 3 divides (6k + 3).
  • So a more efficient method is to test if n is divisible by 2 or 3, then to check through all the numbers of form 6k ± 1 ≤ √n.
  • This is 3 times as fast as testing all m up to √n.

    int IsPrime(unsigned int number) {
        if (number <= 3 && number > 1) 
            return 1;            // as 2 and 3 are prime
        else if (number%2==0 || number%3==0) 
            return 0;     // check if number is divisible by 2 or 3
        else {
            unsigned int i;
            for (i=5; i*i<=number; i+=6) {
                if (number % i == 0 || number%(i + 2) == 0) 
                    return 0;
            }
            return 1; 
        }
    }
    

2
you should return 0 when (number == 1), as 1 isn't a prime number.
Ahmad Ibrahim

These kind of optimisations are IMO irrelevant for this task : why stop at the form 6k ± 1 except 2 and 3, which revrites in n^2 mod 6 = 1, when you can have n^4 mod 30 = 1 except 2,3,5 ... in fact, you can go forever because you're using prime numbers to do this optimisation ... and this IS the very principle of the more general Sieve of Eratosthenes algorithm :)
ghilesZ

1
@GhilesZ: I disagree, this is very relevant to the problem and with a single "||" allows the basic loop to run effectively 3 times faster.
verdy_p

In addition for number==1 it is correctly returning 0 (non-prime) with the tested confition "(number%2==0)", si there's no bug at all
verdy_p

The Eratosthene methoid is a completely different method that requires allocating a large O(n) array of booleans and it will not necessarily be faster due to indexed accesses. This code is fine as it optimizes first the case of the two first primes 2 and 3 (that's why the loop steps by 2*3).
verdy_p

10
  1. Build a table of small primes, and check if they divide your input number.
  2. If the number survived to 1, try pseudo primality tests with increasing basis. See Miller-Rabin primality test for example.
  3. If your number survived to 2, you can conclude it is prime if it is below some well known bounds. Otherwise your answer will only be "probably prime". You will find some values for these bounds in the wiki page.

3
+1: complete overkill for what the questioner was asking, but correct nonetheless.
Stephen Canon

Note that Guy L. recently suggested using Miller-Rabin in an answer too, and linked to rosettacode.org/wiki/Miller-Rabin_primality_test#C — which shows an implementation in C using GMP. The entry also has a number of implementations in a wide variety of other languages too.
Jonathan Leffler

4

this program is much efficient for checking a single number for primality check.

bool check(int n){
    if (n <= 3) {
        return n > 1;
    }

    if (n % 2 == 0 || n % 3 == 0) {
        return false;
    }
        int sq=sqrt(n); //include math.h or use i*i<n in for loop
    for (int i = 5; i<=sq; i += 6) {
        if (n % i == 0 || n % (i + 2) == 0) {
            return false;
        }
    }

    return true;
}

1
To test a prime, you should go all the way from i=2 to i<=ceil(sqrt(n)). You missed 2 numbers in your test: First, cast to (int) makes sqrt(n) trunk the decimals. Second, you used i<sq, when it should be i<=sq. Now, suppose a number that fits this problem. A composite number n that has ceil(sqrt(n)) as the smaller factor. Your inner loop runs for i like: (5, 7), (11, 13), (17, 19), (23, 25), (29, 31), (35, 37), (41, 43), and so on, n%i and n%(i+2). Suppose we get sqrt(1763)=41.98. Being 1763=41*43 a composite number. Your loop will run only until (35, 37) and fail.
DrBeco

@DrBeco nice observation! thanks for example. updated the code.
GorvGoyl

2
After analyzing carefully the ceil() problem, I realized that although lots of sites recommend it, its just overkill. You can trunk and test just i<=sqrt(n) and it will be ok. The test cases are large tween primes. Example: 86028221*86028223=7400854980481283 and sqrt(7400854980481283)~86028222. And the smaller know tween primes, 2 and 3, gives sqrt(6)=2.449 that trunked will still leave 2. (But smaller is not a test case, just a comparison to make a point). So, yes, the algorithm is correct now. No need to use ceil().
DrBeco

3

Check the modulus of each integer from 2 up to the root of the number you're checking.

If modulus equals zero then it's not prime.

pseudo code:

bool IsPrime(int target)
{
  for (i = 2; i <= root(target); i++)
  {
    if ((target mod i) == 0)
    {
      return false;
    }
  }

  return true;
}

2
Of course, the downside is that the the sqrt is calculated on every iteration, which will slow it down a lot.
Rich Bradshaw

9
Any reasonable compiler should be able to detect that root(target) is a loop invariant and hoist it.
Stephen Canon

1
(and if you have a compiler that can't do that optimization, you should absolutely file a bug to let the compiler writer know that they're missing this optimization.)
Stephen Canon

along with many other potential (micro)optimistations, If you manually get the sqrt before the for statement you can check the mod of that as well (and return false if 0).
Matt Lacey

2
What if the target value is 1?
ffffff01

3

After reading this question, I was intrigued by the fact that some answers offered optimization by running a loop with multiples of 2*3=6.

So I create a new function with the same idea, but with multiples of 2*3*5=30.

int check235(unsigned long n)
{
    unsigned long sq, i;

    if(n<=3||n==5)
        return n>1;

    if(n%2==0 || n%3==0 || n%5==0)
        return 0;

    if(n<=30)
        return checkprime(n); /* use another simplified function */

    sq=ceil(sqrt(n));
    for(i=7; i<=sq; i+=30)
        if (n%i==0 || n%(i+4)==0 || n%(i+6)==0 || n%(i+10)==0 || n%(i+12)==0 
           || n%(i+16)==0 || n%(i+22)==0 || n%(i+24)==0)
            return 0;

        return 1;
}

By running both functions and checking times I could state that this function is really faster. Lets see 2 tests with 2 different primes:

$ time ./testprimebool.x 18446744069414584321 0
f(2,3)
Yes, its prime.    
real    0m14.090s
user    0m14.096s
sys     0m0.000s

$ time ./testprimebool.x 18446744069414584321 1
f(2,3,5)
Yes, its prime.    
real    0m9.961s
user    0m9.964s
sys     0m0.000s

$ time ./testprimebool.x 18446744065119617029 0
f(2,3)
Yes, its prime.    
real    0m13.990s
user    0m13.996s
sys     0m0.004s

$ time ./testprimebool.x 18446744065119617029 1
f(2,3,5)
Yes, its prime.    
real    0m10.077s
user    0m10.068s
sys     0m0.004s

So I thought, would someone gain too much if generalized? I came up with a function that will do a siege first to clean a given list of primordial primes, and then use this list to calculate the bigger one.

int checkn(unsigned long n, unsigned long *p, unsigned long t)
{
    unsigned long sq, i, j, qt=1, rt=0;
    unsigned long *q, *r;

    if(n<2)
        return 0;

    for(i=0; i<t; i++)
    {
        if(n%p[i]==0)
            return 0;
        qt*=p[i];
    }
    qt--;

    if(n<=qt)
        return checkprime(n); /* use another simplified function */

    if((q=calloc(qt, sizeof(unsigned long)))==NULL)
    {
        perror("q=calloc()");
        exit(1);
    }
    for(i=0; i<t; i++)
        for(j=p[i]-2; j<qt; j+=p[i])
            q[j]=1;

    for(j=0; j<qt; j++)
        if(q[j])
            rt++;

    rt=qt-rt;
    if((r=malloc(sizeof(unsigned long)*rt))==NULL)
    {
        perror("r=malloc()");
        exit(1);
    }
    i=0;
    for(j=0; j<qt; j++)
        if(!q[j])
            r[i++]=j+1;

    free(q);

    sq=ceil(sqrt(n));
    for(i=1; i<=sq; i+=qt+1)
    {
        if(i!=1 && n%i==0)
            return 0;
        for(j=0; j<rt; j++)
            if(n%(i+r[j])==0)
                return 0;
    }
    return 1;
}

I assume I did not optimize the code, but it's fair. Now, the tests. Because so many dynamic memory, I expected the list 2 3 5 to be a little slower than the 2 3 5 hard-coded. But it was ok as you can see bellow. After that, time got smaller and smaller, culminating the best list to be:

2 3 5 7 11 13 17 19

With 8.6 seconds. So if someone would create a hardcoded program that makes use of such technique I would suggest use the list 2 3 and 5, because the gain is not that big. But also, if willing to code, this list is ok. Problem is you cannot state all cases without a loop, or your code would be very big (There would be 1658879 ORs, that is || in the respective internal if). The next list:

2 3 5 7 11 13 17 19 23

time started to get bigger, with 13 seconds. Here the whole test:

$ time ./testprimebool.x 18446744065119617029 2 3 5
f(2,3,5)
Yes, its prime.
real    0m12.668s
user    0m12.680s
sys     0m0.000s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7
f(2,3,5,7)
Yes, its prime.
real    0m10.889s
user    0m10.900s
sys     0m0.000s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7 11
f(2,3,5,7,11)
Yes, its prime.
real    0m10.021s
user    0m10.028s
sys     0m0.000s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7 11 13
f(2,3,5,7,11,13)
Yes, its prime.
real    0m9.351s
user    0m9.356s
sys     0m0.004s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7 11 13 17
f(2,3,5,7,11,13,17)
Yes, its prime.
real    0m8.802s
user    0m8.800s
sys     0m0.008s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7 11 13 17 19
f(2,3,5,7,11,13,17,19)
Yes, its prime.
real    0m8.614s
user    0m8.564s
sys     0m0.052s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7 11 13 17 19 23
f(2,3,5,7,11,13,17,19,23)
Yes, its prime.
real    0m13.013s
user    0m12.520s
sys     0m0.504s

$ time ./testprimebool.x 18446744065119617029 2 3 5 7 11 13 17 19 23 29
f(2,3,5,7,11,13,17,19,23,29)                                                                                                                         
q=calloc(): Cannot allocate memory

PS. I did not free(r) intentionally, giving this task to the OS, as the memory would be freed as soon as the program exited, to gain some time. But it would be wise to free it if you intend to keep running your code after the calculation.


BONUS

int check2357(unsigned long n)
{
    unsigned long sq, i;

    if(n<=3||n==5||n==7)
        return n>1;

    if(n%2==0 || n%3==0 || n%5==0 || n%7==0)
        return 0;

    if(n<=210)
        return checkprime(n); /* use another simplified function */

    sq=ceil(sqrt(n));
    for(i=11; i<=sq; i+=210)
    {    
        if(n%i==0 || n%(i+2)==0 || n%(i+6)==0 || n%(i+8)==0 || n%(i+12)==0 || 
   n%(i+18)==0 || n%(i+20)==0 || n%(i+26)==0 || n%(i+30)==0 || n%(i+32)==0 || 
   n%(i+36)==0 || n%(i+42)==0 || n%(i+48)==0 || n%(i+50)==0 || n%(i+56)==0 || 
   n%(i+60)==0 || n%(i+62)==0 || n%(i+68)==0 || n%(i+72)==0 || n%(i+78)==0 || 
   n%(i+86)==0 || n%(i+90)==0 || n%(i+92)==0 || n%(i+96)==0 || n%(i+98)==0 || 
   n%(i+102)==0 || n%(i+110)==0 || n%(i+116)==0 || n%(i+120)==0 || n%(i+126)==0 || 
   n%(i+128)==0 || n%(i+132)==0 || n%(i+138)==0 || n%(i+140)==0 || n%(i+146)==0 || 
   n%(i+152)==0 || n%(i+156)==0 || n%(i+158)==0 || n%(i+162)==0 || n%(i+168)==0 || 
   n%(i+170)==0 || n%(i+176)==0 || n%(i+180)==0 || n%(i+182)==0 || n%(i+186)==0 || 
   n%(i+188)==0 || n%(i+198)==0)
            return 0;
    }
    return 1;
}

Time:

$ time ./testprimebool.x 18446744065119617029 7
h(2,3,5,7)
Yes, its prime.
real    0m9.123s
user    0m9.132s
sys     0m0.000s

Bonus: 101-199 primals all fail here because 101 % (11+90).
vp_arth

1
need to stop at n%(i+86) or check n > i+k
vp_arth

Well done, sir. I'll take a look. Thank you. The same problem happens with function check235() for primes 7, 11, 13, 17, 19, 23 and 29
DrBeco

Solution: need to move these reminders to an array, iterate through and break iteration when i+arr[k] >= n
vp_arth

I thought about that, but I don't want a array, because of the if with constants can be better optimized by the compiler. I edited to add an exception and keep the current structure intact. But I agree, with an array can be better to human eyes.
DrBeco

2

I would just add that no even number (bar 2) can be a prime number. This results in another condition prior to for loop. So the end code should look like this:

int IsPrime(unsigned int number) {
    if (number <= 1) return 0; // zero and one are not prime
    if ((number > 2) && ((number % 2) == 0)) return 0; //no even number is prime number (bar 2)
    unsigned int i;
    for (i=2; i*i<=number; i++) {
        if (number % i == 0) return 0;
    }
    return 1;
}

1
int is_prime(int val)
{
   int div,square;

   if (val==2) return TRUE;    /* 2 is prime */
   if ((val&1)==0) return FALSE;    /* any other even number is not */

   div=3;
   square=9;    /* 3*3 */
   while (square<val)
   {
     if (val % div == 0) return FALSE;    /* evenly divisible */
     div+=2;
     square=div*div;
   }
   if (square==val) return FALSE;
   return TRUE;
}

Handling of 2 and even numbers are kept out of the main loop which only handles odd numbers divided by odd numbers. This is because an odd number modulo an even number will always give a non-zero answer which makes those tests redundant. Or, to put it another way, an odd number may be evenly divisible by another odd number but never by an even number (E*E=>E, E*O=>E, O*E=>E and O*O=>O).

A division/modulus is really costly on the x86 architecture although how costly varies (see http://gmplib.org/~tege/x86-timing.pdf). Multiplications on the other hand are quite cheap.


1

Avoid overflow bug

unsigned i, number;
...
for (i=2; i*i<=number; i++) {  // Buggy
for (i=2; i*i<=number; i += 2) {  // Buggy
// or
for (i=5; i*i<=number; i+=6) { // Buggy

These forms are incorrect when number is a prime and i*i is near the max value of the type.

Problem exists with all integer types, signed, unsigned and wider.

Example:

Let UINT_MAX_SQRT as the floor of the square root of the maximum integer value. E.g. 65535 when unsigned is 32-bit.

With for (i=2; i*i<=number; i++), this 10-year old failure occurs because when UINT_MAX_SQRT*UINT_MAX_SQRT <= number and number is a prime, the next iteration results in a multiplication overflow. Had the type been a signed type, the overflow is UB. With unsigned types, this itself is not UB, yet logic has broken down. Interations continue until a truncated product exceeds number. An incorrect result may occur. With 32-bit unsigned, try 4,294,967,291‬ which is a prime.

If some_integer_type_MAX been a Mersenne Prime, i*i<=number is never true.


To avoid this bug, consider that number%i, number/i is efficient on many compilers as the calculations of the quotient and remainder are done together, thus incurring no extra cost to do both vs. just 1.

A simple full-range solution:

bool IsPrime(unsigned number) {
    for(unsigned i = 2; i <= number/i; i++){
        if(number % i == 0){
            return false;
        }
    }
    return number >= 2;
}

0

Using Sieve of Eratosthenes, computation is quite faster compare to "known-wide" prime numbers algorithm.

By using pseudocode from it's wiki (https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes), I be able to have the solution on C#.

public bool IsPrimeNumber(int val) {
    // Using Sieve of Eratosthenes.
    if (val < 2)
    {
        return false;
    }

    // Reserve place for val + 1 and set with true.
    var mark = new bool[val + 1];
    for(var i = 2; i <= val; i++)
    {
        mark[i] = true;
    }

    // Iterate from 2 ... sqrt(val).
    for (var i = 2; i <= Math.Sqrt(val); i++)
    {
        if (mark[i])
        {
            // Cross out every i-th number in the places after i (all the multiples of i).
            for (var j = (i * i); j <= val; j += i)
            {
                mark[j] = false;
            }
        }
    }

    return mark[val];
}

IsPrimeNumber(1000000000) takes 21s 758ms.

NOTE: Value might vary depend on hardware specifications.

Використовуючи наш веб-сайт, ви визнаєте, що прочитали та зрозуміли наші Політику щодо файлів cookie та Політику конфіденційності.
Licensed under cc by-sa 3.0 with attribution required.