select RAND() with probability












1















I want to display 10 lines of the "questions" table with probability 0.2 of question that have type_id = 1 and probability 0.8 of question that have type_id =2.
Below my request, how to add the probability?



$query = "select * from questions ORDER BY RAND() LIMIT 10";


enter image description here



I want to display 10 questions which 20% of the questions have type_id = 2 and 80% have type_id = 1.



can someone help me please?










share|improve this question




















  • 3





    ORDER BY RAND() is a very basic feature. If you need anything beyond a basic random order (or if you need half-decent performance), you'll need to forget about using it, and switch to a more nuanced approach.

    – Spudley
    Nov 26 '18 at 11:22











  • @Spudley,can you help me to resolved it?

    – sonia
    Nov 27 '18 at 8:30











  • Given the change in the question, why not just run two queries; one for the type 1 questions and another for the type 2.

    – Spudley
    Nov 27 '18 at 8:43











  • I tried this request but unfortunately does not work: SELECT * FROM questions WHERE type_id= 2 ORDER BY rand() LIMIT 2 UNION SELECT * FROM questions WHERE type_id= 1 ORDER BY rand() LIMIT 8 ORDER BY rand()

    – sonia
    Nov 27 '18 at 8:59













  • You've doubled-up on the ORDER BY rand(). Get rid of the last one on the end of the query and it should work. If you're trying to shuffle the two sets together, you may need brackets around the different bits of the query, but honestly if it's only ten records, you're probably better off loading the two sets into a PHP array and randomising the order in PHP.

    – Spudley
    Nov 27 '18 at 9:09
















1















I want to display 10 lines of the "questions" table with probability 0.2 of question that have type_id = 1 and probability 0.8 of question that have type_id =2.
Below my request, how to add the probability?



$query = "select * from questions ORDER BY RAND() LIMIT 10";


enter image description here



I want to display 10 questions which 20% of the questions have type_id = 2 and 80% have type_id = 1.



can someone help me please?










share|improve this question




















  • 3





    ORDER BY RAND() is a very basic feature. If you need anything beyond a basic random order (or if you need half-decent performance), you'll need to forget about using it, and switch to a more nuanced approach.

    – Spudley
    Nov 26 '18 at 11:22











  • @Spudley,can you help me to resolved it?

    – sonia
    Nov 27 '18 at 8:30











  • Given the change in the question, why not just run two queries; one for the type 1 questions and another for the type 2.

    – Spudley
    Nov 27 '18 at 8:43











  • I tried this request but unfortunately does not work: SELECT * FROM questions WHERE type_id= 2 ORDER BY rand() LIMIT 2 UNION SELECT * FROM questions WHERE type_id= 1 ORDER BY rand() LIMIT 8 ORDER BY rand()

    – sonia
    Nov 27 '18 at 8:59













  • You've doubled-up on the ORDER BY rand(). Get rid of the last one on the end of the query and it should work. If you're trying to shuffle the two sets together, you may need brackets around the different bits of the query, but honestly if it's only ten records, you're probably better off loading the two sets into a PHP array and randomising the order in PHP.

    – Spudley
    Nov 27 '18 at 9:09














1












1








1








I want to display 10 lines of the "questions" table with probability 0.2 of question that have type_id = 1 and probability 0.8 of question that have type_id =2.
Below my request, how to add the probability?



$query = "select * from questions ORDER BY RAND() LIMIT 10";


enter image description here



I want to display 10 questions which 20% of the questions have type_id = 2 and 80% have type_id = 1.



can someone help me please?










share|improve this question
















I want to display 10 lines of the "questions" table with probability 0.2 of question that have type_id = 1 and probability 0.8 of question that have type_id =2.
Below my request, how to add the probability?



$query = "select * from questions ORDER BY RAND() LIMIT 10";


enter image description here



I want to display 10 questions which 20% of the questions have type_id = 2 and 80% have type_id = 1.



can someone help me please?







php select mysqli phpmyadmin






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 27 '18 at 8:28







sonia

















asked Nov 23 '18 at 14:37









soniasonia

124




124








  • 3





    ORDER BY RAND() is a very basic feature. If you need anything beyond a basic random order (or if you need half-decent performance), you'll need to forget about using it, and switch to a more nuanced approach.

    – Spudley
    Nov 26 '18 at 11:22











  • @Spudley,can you help me to resolved it?

    – sonia
    Nov 27 '18 at 8:30











  • Given the change in the question, why not just run two queries; one for the type 1 questions and another for the type 2.

    – Spudley
    Nov 27 '18 at 8:43











  • I tried this request but unfortunately does not work: SELECT * FROM questions WHERE type_id= 2 ORDER BY rand() LIMIT 2 UNION SELECT * FROM questions WHERE type_id= 1 ORDER BY rand() LIMIT 8 ORDER BY rand()

    – sonia
    Nov 27 '18 at 8:59













  • You've doubled-up on the ORDER BY rand(). Get rid of the last one on the end of the query and it should work. If you're trying to shuffle the two sets together, you may need brackets around the different bits of the query, but honestly if it's only ten records, you're probably better off loading the two sets into a PHP array and randomising the order in PHP.

    – Spudley
    Nov 27 '18 at 9:09














  • 3





    ORDER BY RAND() is a very basic feature. If you need anything beyond a basic random order (or if you need half-decent performance), you'll need to forget about using it, and switch to a more nuanced approach.

    – Spudley
    Nov 26 '18 at 11:22











  • @Spudley,can you help me to resolved it?

    – sonia
    Nov 27 '18 at 8:30











  • Given the change in the question, why not just run two queries; one for the type 1 questions and another for the type 2.

    – Spudley
    Nov 27 '18 at 8:43











  • I tried this request but unfortunately does not work: SELECT * FROM questions WHERE type_id= 2 ORDER BY rand() LIMIT 2 UNION SELECT * FROM questions WHERE type_id= 1 ORDER BY rand() LIMIT 8 ORDER BY rand()

    – sonia
    Nov 27 '18 at 8:59













  • You've doubled-up on the ORDER BY rand(). Get rid of the last one on the end of the query and it should work. If you're trying to shuffle the two sets together, you may need brackets around the different bits of the query, but honestly if it's only ten records, you're probably better off loading the two sets into a PHP array and randomising the order in PHP.

    – Spudley
    Nov 27 '18 at 9:09








3




3





ORDER BY RAND() is a very basic feature. If you need anything beyond a basic random order (or if you need half-decent performance), you'll need to forget about using it, and switch to a more nuanced approach.

– Spudley
Nov 26 '18 at 11:22





ORDER BY RAND() is a very basic feature. If you need anything beyond a basic random order (or if you need half-decent performance), you'll need to forget about using it, and switch to a more nuanced approach.

– Spudley
Nov 26 '18 at 11:22













@Spudley,can you help me to resolved it?

– sonia
Nov 27 '18 at 8:30





@Spudley,can you help me to resolved it?

– sonia
Nov 27 '18 at 8:30













Given the change in the question, why not just run two queries; one for the type 1 questions and another for the type 2.

– Spudley
Nov 27 '18 at 8:43





Given the change in the question, why not just run two queries; one for the type 1 questions and another for the type 2.

– Spudley
Nov 27 '18 at 8:43













I tried this request but unfortunately does not work: SELECT * FROM questions WHERE type_id= 2 ORDER BY rand() LIMIT 2 UNION SELECT * FROM questions WHERE type_id= 1 ORDER BY rand() LIMIT 8 ORDER BY rand()

– sonia
Nov 27 '18 at 8:59







I tried this request but unfortunately does not work: SELECT * FROM questions WHERE type_id= 2 ORDER BY rand() LIMIT 2 UNION SELECT * FROM questions WHERE type_id= 1 ORDER BY rand() LIMIT 8 ORDER BY rand()

– sonia
Nov 27 '18 at 8:59















You've doubled-up on the ORDER BY rand(). Get rid of the last one on the end of the query and it should work. If you're trying to shuffle the two sets together, you may need brackets around the different bits of the query, but honestly if it's only ten records, you're probably better off loading the two sets into a PHP array and randomising the order in PHP.

– Spudley
Nov 27 '18 at 9:09





You've doubled-up on the ORDER BY rand(). Get rid of the last one on the end of the query and it should work. If you're trying to shuffle the two sets together, you may need brackets around the different bits of the query, but honestly if it's only ten records, you're probably better off loading the two sets into a PHP array and randomising the order in PHP.

– Spudley
Nov 27 '18 at 9:09












1 Answer
1






active

oldest

votes


















1














As I noted in the comments, you won't be able to use anything as obvious as ORDER BY RAND() if you want to include probabilities or anything like that. ORDER BY RAND() simply doesn't support that kind of thing. ORDER BY RAND() is also very slow, and not really suitable for use on a database of any significant size anyway.



There are a whole bunch of approaches you can use to do a random sort order with weighting or probabilities; I'm not going to try to discuss them all; I'll just give you a relatively simple one, but please be aware that the best technique for you will depend on your specific use case.



A simple approach would be something like this:




  1. Create a new integer field on your table called weight or something similar.

  2. Add a DB index for this field to enable you to query it quickly.

  3. Set the first record to a value equal to its weighting as a whole number. ie a probability of 0.2 could be a weight of 20.

  4. Set each subsequent record to the max value of this field plus the weight for that record. So if the second record is also 0.2, it would get a value of 40; if the one after that is only 0.1, it would be 50; and so on.

  5. Do likewise for any new records that get added.


Now you can select a random record, with different weights for each record, as follows:



SELECT * FROM questions 
WHERE weight >= FLOOR(RAND() * (SELECT MAX(weight) FROM questions))
ORDER BY weight
LIMIT 1


(note, I'm writing is answer in a hurry and without resource to test it; I haven't run this query so I may have got the syntax wrong, but the basic technique is sound)



This will pick a random number between zero and the largest weight value, and then find the question record that has the closest weight value to that random number.



Also, because the weight field is indexed, this query will be quick and efficient.



Downsides of this technique: It assumes that the weights for any given record won't change. If the weight of a record does need to change, then you would have to update the weight value for every record after it in the index.



[EDIT]



Let's imagine a table like this:



id   Name
1 Question One
2 Question Two
3 Question Three
4 Question Four
5 Question Five


In this example, we want Questions 1 and 2 to have a probability of 0.2, question 3 to have a probability of 0.1 and questions 4 and 5 to have a probability of 0.3. Those probabilities can be expressed as integers by multiplying them by 100. (multiply by 10 also works, but 100 means we can have probabilities like 0.15 as well)



We add the weight column and the index for it, and set the weight values as follows:



id   Name            Weight
1 Question One 20
2 Question Two 40 (ie previous value + 20)
3 Question Three 50 (ie previous value + 10)
4 Question Four 80 (ie previous value + 30)
5 Question Five 110 (ie previous value + 30)


Now we can run our query.



The random part of the query FLOOR(RAND() * (SELECT MAX(weight) FROM questions)) will select a value between zero and 110. Let's imagine it gives 68.



Now the rest of our query says to pick the first record where the weight is greater than 68. In this case, that means that the record we get is record #4.



This gives us our probability because the random number could be anything, but is more likely to select a given record if the gap between its weight and the one before it is larger. You'll get record #4 three times as often as record #3.






share|improve this answer


























  • thank you for your reply but I do not understand points 2,3 and 4

    – sonia
    Nov 26 '18 at 12:34











  • Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

    – Spudley
    Nov 26 '18 at 12:49











  • Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

    – sonia
    Nov 26 '18 at 13:39











  • Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

    – Spudley
    Nov 26 '18 at 13:42











Your Answer






StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53448653%2fselect-rand-with-probability%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









1














As I noted in the comments, you won't be able to use anything as obvious as ORDER BY RAND() if you want to include probabilities or anything like that. ORDER BY RAND() simply doesn't support that kind of thing. ORDER BY RAND() is also very slow, and not really suitable for use on a database of any significant size anyway.



There are a whole bunch of approaches you can use to do a random sort order with weighting or probabilities; I'm not going to try to discuss them all; I'll just give you a relatively simple one, but please be aware that the best technique for you will depend on your specific use case.



A simple approach would be something like this:




  1. Create a new integer field on your table called weight or something similar.

  2. Add a DB index for this field to enable you to query it quickly.

  3. Set the first record to a value equal to its weighting as a whole number. ie a probability of 0.2 could be a weight of 20.

  4. Set each subsequent record to the max value of this field plus the weight for that record. So if the second record is also 0.2, it would get a value of 40; if the one after that is only 0.1, it would be 50; and so on.

  5. Do likewise for any new records that get added.


Now you can select a random record, with different weights for each record, as follows:



SELECT * FROM questions 
WHERE weight >= FLOOR(RAND() * (SELECT MAX(weight) FROM questions))
ORDER BY weight
LIMIT 1


(note, I'm writing is answer in a hurry and without resource to test it; I haven't run this query so I may have got the syntax wrong, but the basic technique is sound)



This will pick a random number between zero and the largest weight value, and then find the question record that has the closest weight value to that random number.



Also, because the weight field is indexed, this query will be quick and efficient.



Downsides of this technique: It assumes that the weights for any given record won't change. If the weight of a record does need to change, then you would have to update the weight value for every record after it in the index.



[EDIT]



Let's imagine a table like this:



id   Name
1 Question One
2 Question Two
3 Question Three
4 Question Four
5 Question Five


In this example, we want Questions 1 and 2 to have a probability of 0.2, question 3 to have a probability of 0.1 and questions 4 and 5 to have a probability of 0.3. Those probabilities can be expressed as integers by multiplying them by 100. (multiply by 10 also works, but 100 means we can have probabilities like 0.15 as well)



We add the weight column and the index for it, and set the weight values as follows:



id   Name            Weight
1 Question One 20
2 Question Two 40 (ie previous value + 20)
3 Question Three 50 (ie previous value + 10)
4 Question Four 80 (ie previous value + 30)
5 Question Five 110 (ie previous value + 30)


Now we can run our query.



The random part of the query FLOOR(RAND() * (SELECT MAX(weight) FROM questions)) will select a value between zero and 110. Let's imagine it gives 68.



Now the rest of our query says to pick the first record where the weight is greater than 68. In this case, that means that the record we get is record #4.



This gives us our probability because the random number could be anything, but is more likely to select a given record if the gap between its weight and the one before it is larger. You'll get record #4 three times as often as record #3.






share|improve this answer


























  • thank you for your reply but I do not understand points 2,3 and 4

    – sonia
    Nov 26 '18 at 12:34











  • Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

    – Spudley
    Nov 26 '18 at 12:49











  • Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

    – sonia
    Nov 26 '18 at 13:39











  • Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

    – Spudley
    Nov 26 '18 at 13:42
















1














As I noted in the comments, you won't be able to use anything as obvious as ORDER BY RAND() if you want to include probabilities or anything like that. ORDER BY RAND() simply doesn't support that kind of thing. ORDER BY RAND() is also very slow, and not really suitable for use on a database of any significant size anyway.



There are a whole bunch of approaches you can use to do a random sort order with weighting or probabilities; I'm not going to try to discuss them all; I'll just give you a relatively simple one, but please be aware that the best technique for you will depend on your specific use case.



A simple approach would be something like this:




  1. Create a new integer field on your table called weight or something similar.

  2. Add a DB index for this field to enable you to query it quickly.

  3. Set the first record to a value equal to its weighting as a whole number. ie a probability of 0.2 could be a weight of 20.

  4. Set each subsequent record to the max value of this field plus the weight for that record. So if the second record is also 0.2, it would get a value of 40; if the one after that is only 0.1, it would be 50; and so on.

  5. Do likewise for any new records that get added.


Now you can select a random record, with different weights for each record, as follows:



SELECT * FROM questions 
WHERE weight >= FLOOR(RAND() * (SELECT MAX(weight) FROM questions))
ORDER BY weight
LIMIT 1


(note, I'm writing is answer in a hurry and without resource to test it; I haven't run this query so I may have got the syntax wrong, but the basic technique is sound)



This will pick a random number between zero and the largest weight value, and then find the question record that has the closest weight value to that random number.



Also, because the weight field is indexed, this query will be quick and efficient.



Downsides of this technique: It assumes that the weights for any given record won't change. If the weight of a record does need to change, then you would have to update the weight value for every record after it in the index.



[EDIT]



Let's imagine a table like this:



id   Name
1 Question One
2 Question Two
3 Question Three
4 Question Four
5 Question Five


In this example, we want Questions 1 and 2 to have a probability of 0.2, question 3 to have a probability of 0.1 and questions 4 and 5 to have a probability of 0.3. Those probabilities can be expressed as integers by multiplying them by 100. (multiply by 10 also works, but 100 means we can have probabilities like 0.15 as well)



We add the weight column and the index for it, and set the weight values as follows:



id   Name            Weight
1 Question One 20
2 Question Two 40 (ie previous value + 20)
3 Question Three 50 (ie previous value + 10)
4 Question Four 80 (ie previous value + 30)
5 Question Five 110 (ie previous value + 30)


Now we can run our query.



The random part of the query FLOOR(RAND() * (SELECT MAX(weight) FROM questions)) will select a value between zero and 110. Let's imagine it gives 68.



Now the rest of our query says to pick the first record where the weight is greater than 68. In this case, that means that the record we get is record #4.



This gives us our probability because the random number could be anything, but is more likely to select a given record if the gap between its weight and the one before it is larger. You'll get record #4 three times as often as record #3.






share|improve this answer


























  • thank you for your reply but I do not understand points 2,3 and 4

    – sonia
    Nov 26 '18 at 12:34











  • Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

    – Spudley
    Nov 26 '18 at 12:49











  • Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

    – sonia
    Nov 26 '18 at 13:39











  • Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

    – Spudley
    Nov 26 '18 at 13:42














1












1








1







As I noted in the comments, you won't be able to use anything as obvious as ORDER BY RAND() if you want to include probabilities or anything like that. ORDER BY RAND() simply doesn't support that kind of thing. ORDER BY RAND() is also very slow, and not really suitable for use on a database of any significant size anyway.



There are a whole bunch of approaches you can use to do a random sort order with weighting or probabilities; I'm not going to try to discuss them all; I'll just give you a relatively simple one, but please be aware that the best technique for you will depend on your specific use case.



A simple approach would be something like this:




  1. Create a new integer field on your table called weight or something similar.

  2. Add a DB index for this field to enable you to query it quickly.

  3. Set the first record to a value equal to its weighting as a whole number. ie a probability of 0.2 could be a weight of 20.

  4. Set each subsequent record to the max value of this field plus the weight for that record. So if the second record is also 0.2, it would get a value of 40; if the one after that is only 0.1, it would be 50; and so on.

  5. Do likewise for any new records that get added.


Now you can select a random record, with different weights for each record, as follows:



SELECT * FROM questions 
WHERE weight >= FLOOR(RAND() * (SELECT MAX(weight) FROM questions))
ORDER BY weight
LIMIT 1


(note, I'm writing is answer in a hurry and without resource to test it; I haven't run this query so I may have got the syntax wrong, but the basic technique is sound)



This will pick a random number between zero and the largest weight value, and then find the question record that has the closest weight value to that random number.



Also, because the weight field is indexed, this query will be quick and efficient.



Downsides of this technique: It assumes that the weights for any given record won't change. If the weight of a record does need to change, then you would have to update the weight value for every record after it in the index.



[EDIT]



Let's imagine a table like this:



id   Name
1 Question One
2 Question Two
3 Question Three
4 Question Four
5 Question Five


In this example, we want Questions 1 and 2 to have a probability of 0.2, question 3 to have a probability of 0.1 and questions 4 and 5 to have a probability of 0.3. Those probabilities can be expressed as integers by multiplying them by 100. (multiply by 10 also works, but 100 means we can have probabilities like 0.15 as well)



We add the weight column and the index for it, and set the weight values as follows:



id   Name            Weight
1 Question One 20
2 Question Two 40 (ie previous value + 20)
3 Question Three 50 (ie previous value + 10)
4 Question Four 80 (ie previous value + 30)
5 Question Five 110 (ie previous value + 30)


Now we can run our query.



The random part of the query FLOOR(RAND() * (SELECT MAX(weight) FROM questions)) will select a value between zero and 110. Let's imagine it gives 68.



Now the rest of our query says to pick the first record where the weight is greater than 68. In this case, that means that the record we get is record #4.



This gives us our probability because the random number could be anything, but is more likely to select a given record if the gap between its weight and the one before it is larger. You'll get record #4 three times as often as record #3.






share|improve this answer















As I noted in the comments, you won't be able to use anything as obvious as ORDER BY RAND() if you want to include probabilities or anything like that. ORDER BY RAND() simply doesn't support that kind of thing. ORDER BY RAND() is also very slow, and not really suitable for use on a database of any significant size anyway.



There are a whole bunch of approaches you can use to do a random sort order with weighting or probabilities; I'm not going to try to discuss them all; I'll just give you a relatively simple one, but please be aware that the best technique for you will depend on your specific use case.



A simple approach would be something like this:




  1. Create a new integer field on your table called weight or something similar.

  2. Add a DB index for this field to enable you to query it quickly.

  3. Set the first record to a value equal to its weighting as a whole number. ie a probability of 0.2 could be a weight of 20.

  4. Set each subsequent record to the max value of this field plus the weight for that record. So if the second record is also 0.2, it would get a value of 40; if the one after that is only 0.1, it would be 50; and so on.

  5. Do likewise for any new records that get added.


Now you can select a random record, with different weights for each record, as follows:



SELECT * FROM questions 
WHERE weight >= FLOOR(RAND() * (SELECT MAX(weight) FROM questions))
ORDER BY weight
LIMIT 1


(note, I'm writing is answer in a hurry and without resource to test it; I haven't run this query so I may have got the syntax wrong, but the basic technique is sound)



This will pick a random number between zero and the largest weight value, and then find the question record that has the closest weight value to that random number.



Also, because the weight field is indexed, this query will be quick and efficient.



Downsides of this technique: It assumes that the weights for any given record won't change. If the weight of a record does need to change, then you would have to update the weight value for every record after it in the index.



[EDIT]



Let's imagine a table like this:



id   Name
1 Question One
2 Question Two
3 Question Three
4 Question Four
5 Question Five


In this example, we want Questions 1 and 2 to have a probability of 0.2, question 3 to have a probability of 0.1 and questions 4 and 5 to have a probability of 0.3. Those probabilities can be expressed as integers by multiplying them by 100. (multiply by 10 also works, but 100 means we can have probabilities like 0.15 as well)



We add the weight column and the index for it, and set the weight values as follows:



id   Name            Weight
1 Question One 20
2 Question Two 40 (ie previous value + 20)
3 Question Three 50 (ie previous value + 10)
4 Question Four 80 (ie previous value + 30)
5 Question Five 110 (ie previous value + 30)


Now we can run our query.



The random part of the query FLOOR(RAND() * (SELECT MAX(weight) FROM questions)) will select a value between zero and 110. Let's imagine it gives 68.



Now the rest of our query says to pick the first record where the weight is greater than 68. In this case, that means that the record we get is record #4.



This gives us our probability because the random number could be anything, but is more likely to select a given record if the gap between its weight and the one before it is larger. You'll get record #4 three times as often as record #3.







share|improve this answer














share|improve this answer



share|improve this answer








edited Nov 26 '18 at 13:01

























answered Nov 26 '18 at 12:10









SpudleySpudley

140k33196276




140k33196276













  • thank you for your reply but I do not understand points 2,3 and 4

    – sonia
    Nov 26 '18 at 12:34











  • Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

    – Spudley
    Nov 26 '18 at 12:49











  • Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

    – sonia
    Nov 26 '18 at 13:39











  • Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

    – Spudley
    Nov 26 '18 at 13:42



















  • thank you for your reply but I do not understand points 2,3 and 4

    – sonia
    Nov 26 '18 at 12:34











  • Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

    – Spudley
    Nov 26 '18 at 12:49











  • Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

    – sonia
    Nov 26 '18 at 13:39











  • Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

    – Spudley
    Nov 26 '18 at 13:42

















thank you for your reply but I do not understand points 2,3 and 4

– sonia
Nov 26 '18 at 12:34





thank you for your reply but I do not understand points 2,3 and 4

– sonia
Nov 26 '18 at 12:34













Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

– Spudley
Nov 26 '18 at 12:49





Re point 2: Add an index... indexes are a whole other topic, but a very important one if you're doing database work. It's too much to explain the whole concept of indexes here, but I strongly suggest taking a bit of time to learn them. There are lots of resources to learn from if you search. Re points 3,4,5: What about these points do you not understand? I've tried to explain it clearly. I will add an example to the end, but tell me if there's more I can do to explain it.

– Spudley
Nov 26 '18 at 12:49













Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

– sonia
Nov 26 '18 at 13:39





Thank for your effort.I think that's not what I'm looking for. I modified my post to be clearer.

– sonia
Nov 26 '18 at 13:39













Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

– Spudley
Nov 26 '18 at 13:42





Fair enough. The update to the question definitely changes things. I'll leave this answer in place as it may help others.

– Spudley
Nov 26 '18 at 13:42




















draft saved

draft discarded




















































Thanks for contributing an answer to Stack Overflow!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53448653%2fselect-rand-with-probability%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Costa Masnaga

Fotorealismo

Sidney Franklin