Question Engine 2:Numerical formats: Difference between revisions

Revision as of 14:00, 15 May 2011

The Question Engine 2 structure allows implementation of new features for numerical question types (numerical and calculated).

This page describes a possible implementation and its rationale.

The text should be read as a personal summary of work in progress, not as a textbook on computer languages and the representation of real numbers.

It is related to an initial quiz forum discussion http://moodle.org/mod/forum/discuss.php?d=172211#p755927

Please put your comments on the forum. http://moodle.org/mod/forum/discuss.php?d=174387

Numbers

from http://en.wikipedia.org/wiki/Number

The real numbers include all of the measuring numbers. Real numbers are usually written using decimal numerals, in which a decimal point is placed to the right of the digit with place value one. Each digit to the right of the decimal point has a place value one-tenth of the place value of the digit to its left.

Thus

123.456

represents 1 hundred, 2 tens, 3 ones, 4 tenths, 5 hundredths, and 6 thousandths.

In the US and UK and a number of other countries, the decimal point is represented by a period, whereas in continental Europe and certain other countries the decimal point is represented by a comma.

Zero is often written as 0.0 when it must be treated as a real number rather than an integer.

In the US and UK a number between −1 and 1 is always written with a leading zero to emphasize the decimal.

Negative real numbers are written with a preceding minus sign: -123.456.

Every rational number is also a real number. It is not the case, however, that every real number is rational. If a real number cannot be written as a fraction of two integers, it is called irrational.

A decimal that can be written as a fraction either ends (terminates) or repeats forever, because it is the answer to a problem in division.

Thus the real number 0.5 can be written as 1/2 and the real number 0.333... (forever repeating threes, otherwise written 0.3̅) can be written as 1/3. On the other hand, the real number π, the ratio of the circumference of any circle to its diameter, is π = 3.14159265358979... Since the decimal neither ends nor forever repeats, it cannot be written as a fraction, and is an example of an irrational number.

Thus 1.0 and 0.999... are two different decimal numerals representing the natural number 1.

There are infinitely many other ways of representing the number 1, for example 2/2, 3/3, 1.00, 1.000, and so on.

Every real number is either rational or irrational. Every real number corresponds to a point on the number line.

When a real number represents a measurement, there is always a margin of error. This is often indicated by rounding or truncating a decimal, so that digits that suggest a greater accuracy than the measurement itself are removed. The remaining digits are called significant digits.

For example, measurements with a ruler can seldom be made without a margin of error of at least 0.01 meters. If the sides of a rectangle are measured as 1.23 meters and 4.56 meters, then multiplication gives an area for the rectangle of 5.6088 square meters. Since only the first two digits after the decimal place are significant, this is usually rounded to 5.61.

Numerical question grading

There are 3 elements that can be graded in a numerical question

1. the numeric value
2. the unit used in relation to the numeric value
3. the number format used to express the value

Items 1 and 2 are addressed in Moodle 2,0 for questions created from edit_numerical_form.php.

Item 3 can lead to a zero grade if the student does not use one of the allowed number formats, but there is no specific grading for it comparable to the unit penalty available since Moodle 2,0.

Designing a number format penalty

We already have 2 ways to grade the student response available in edit_numerical_form.php, i.e.

the answer-tolerance combination
the detailed unit grading

We need flexibility to take into account that the number format can vary even within the same location, e.g. in Canada there are two locales because there are two official languages (French and English).

Teachers sometimes ask for a specific number format, such as a fraction, that cannot be combined with the other grading options.

The most general solution is to associate a grade with a specific numerical format for each answer, in the same way that the tolerance is used.

For example, in calculated questions the tolerance can be set as relative or absolute.

So my proposal is to add the number format as a new answer parameter.

To an official list of locale formats we could add

specific Moodle formats such as the ones used in 1,9 and 2,0,
a fraction format,
time format,
degree, minute, second
etc.

Numbers as written by humans and read by computer languages

The main feature of the numerical question type is to ask the student for a numerical answer, i.e. a number. Sometimes this is an integer (dates are a common example of an integer response), but most often it is a real number whose value is expressed as a decimal number, e.g. 1.234.

Computer languages (e.g. PHP, used in Moodle) store real numbers differently from the way humans write them (a decimal part and an exponent, similar to 1.234E00), and humans do not write real numbers in a universal format.

The separator between the units digit and the decimal fraction is usually either a . or a ,
1.234   1,234

Furthermore, to make large numbers easier to read, most languages add another separator for thousands, often a space, or a , if it is not already used as the decimal separator.

123 456.78     123,456.78   123 456,78

PHP, as a computer language, uses spaces to separate the language components, so it cannot use a space as a thousands separator.

The , is also used to separate variables (for example in argument lists), so PHP uses the simpler syntax 123456.78.

As the number is stored in the computer in 2 parts (number and exponent), 1.2345678E+05 is also a form that is well recognized by a computer language such as PHP.
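As a rough illustration (a sketch, not Moodle code), the snippet below shows that the plain decimal and the exponent notation denote the same PHP float, and that thousands and decimal separators only appear when a number is formatted for display. The variable names are chosen here for illustration; number_format() is a standard PHP function.

```php
<?php
// Two PHP literals for the same quantity: plain decimal and exponent form.
$plain = 123456.78;
$exponent = 1.2345678E+05;

// Separators exist only in the displayed text, never in the stored float.
$english = number_format($plain, 2, '.', ',');  // decimal point, comma for thousands
$french  = number_format($plain, 2, ',', ' ');  // decimal comma, space for thousands

var_dump(abs($plain - $exponent) < 1e-9);
echo $english, "\n", $french, "\n";
```

The two formatted strings are what a human would type; only the unformatted literals are valid inside PHP code.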

Computers as dumb readers of human number writing

Ordinary people know from their experience with hand calculators or even sophisticated spreadsheets that computers are more or less "stupid" at correctly recognizing the numbers or numerical values typed by humans.

We as humans need to know the number style that the computer will recognize, as this can vary among the different computer usages.

Typing a number in your on-line tax report is not the same as typing a number inside a complex math formula. Tax reports will handle various number formats well, e.g.

with or without the $ sign,
with or without the space or , as thousands separator,
but will not accept exponential formatting such as $ 1e6, etc.

However, when writing a mathematical formula you cannot

put units, e.g. $,
use a space or , as thousands separator in numbers, etc.

Even if you type a number in one format, the computer may display it in another format.

You type in a spreadsheet =1e-3

On return the display could be 0.001

Number formats can differ according to their usage, e.g. tax reports and mathematical formulas.

Humans have to learn which number format to use with a given piece of software.

Moodle 1,9 and 2,0 users and numbers in numerical and calculated question types

There are two main cases

The teacher setting the answer

in a numerical question answer

numerical questiontype IS NOT a shortanswer questiontype

At first sight, one might think that the number should be typed exactly as the response the student must type to get the full grade.

If you want to do this, then you should use shortanswer questiontype.

In numerical questiontype the answer is a numerical value or quantity.

This is why we add tolerance grading to the numerical quantity.

There is no case sensitivity in a numerical question; case sensitivity in a shortanswer question is the analogue of the numerical tolerance, or vice versa...

So when a teacher sets the answer field (and the tolerance field) in a numerical question, he sets a value or numerical quantity, and PHP will understand this numerical quantity written in decimal format, as 1234.56, or in exponential format, as 1.23456E3.

Notice that PHP in Moodle always uses . as the decimal point.

In the edit_numerical_form or GIFT import, these formats with . are mandatory.

In Cloze or other imports the decimal point can be , in Moodle 2,0.

in a calculated question formula answer

In the calculated questiontype the numerical value or quantity is obtained from a mathematical equation that, among other elements (functions, math symbols + - * / etc.), contains numerical quantities expressed as numbers.

The numbers in the equation are written following strict PHP rules, in decimal format as 1234.56 or in exponential format as 1.23456E3.

The student typing a response in a numerical or calculated question

in 1,9 the only decimal point allowed was .

in 2,0 the decimal point could be . or ,

in 1,9 and 2,0 the thousands separators space or , (when decsep is .) were removed, as required by PHP syntax.

THE EASIEST RULE FOR TEACHERS OR STUDENTS IS: ALWAYS USE PHP DECIMAL FORMAT as 1234.56 or as 1.23456E3.

This also means that in the current code, when a teacher sets the answer field, he DOES NOT set the response format.

Grading the response format : truth table

In the current 2,0 code, if the decimal point and thousands separator are not set correctly by the student, he could get a zero grade for a correct numerical response.

However, a clever student will never use thousands separators and will learn once and for all whether he can use , or . as decimal separator.

The current 2,1 proposal is to grade the numerical format separately from the numerical quantity.

On analysis, since a unit penalty is already applied, the best way to do this appears to be to add at least the decimal point as an additional answer parameter alongside the tolerance.

A complete grading should also allow the teacher to specify the thousands separator as mandatory or to control the use of the exponential format.

Given the time constraints for 2,1, let's design a decimal point option.

The following table describes the results for the various versions:

Moodle Version	decsep	thousand sep	1 234.56	1,234.56	1234.56	1.23456E3	1 234,56	1.234,56	1234,56	1,23456E3
1,9			OK	OK	OK	OK	NO	NO	NO	NO
2,0			OK	OK	OK	OK	OK	NO	OK	OK
2,1	.	,	NO	OK	OK	OK	NO	NO	NO	NO
2,1	.	space	OK	NO	OK	OK	NO	NO	NO	NO
2,1	,	space	NO	NO	NO	NO	OK	NO	OK	OK
2,1	,	.	NO	NO	NO	NO	NO	OK	OK	OK
2,1 proposal			OK	OK	OK	OK	OK	OK	OK	OK
2,1 proposal	.		OK	OK	OK	OK	NO	NO	NO	NO
2,1 proposal	,		NO	NO	NO	NO	OK	OK	OK	OK

OR, with the NO cells left blank:

Moodle Version	decsep	thousand sep	1 234.56	1,234.56	1234.56	1.23456E3	1 234,56	1.234,56	1234,56	1,23456E3
1,9			OK	OK	OK	OK
2,0			OK	OK	OK	OK	OK		OK	OK
2,1	.	,		OK	OK	OK
2,1	.	space	OK		OK	OK
2,1	,	space					OK		OK	OK
2,1	,	.						OK	OK	OK
2,1 proposal			OK	OK	OK	OK	OK	OK	OK	OK
2,1 proposal	.		OK	OK	OK	OK
2,1 proposal	,						OK	OK	OK	OK

In 2,1 there is no 'universal option' that allows most of the known formats, and in 2,0 the German format 1.234,56 is not recognized.

In 1,9 and 2,0 the resulting PHP number is always valid. In 2,1 the resulting PHP number is valid only if you use the right thousands separator.

So no real control of the thousands separator is needed as long as students learn not to use it.

As a first step in grading the response format in 2,1, I propose that we limit ourselves to grading the decsep, allowing either , or . and using a modified 2,0 version to handle the German use of . and ,

From the table we can see that, with the proposal, when upgrading or importing from 1,9 the default decsep should be set to . and when upgrading or importing from 2,0 the default decsep should be set to nil, i.e. universal.

We could, as in the 2,0 help, warn students that if they use , or . as thousands separators they MUST also put the decimal separator.

In 2,2 we could develop grading of the thousands separator along with control of the exponential form.

In all cases we need a "universal" solution for the Cloze questiontype. In calculated questions the interface is already quite complex and perhaps things will not be implemented for 2,1.
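The three "2,1 proposal" rows of the table could be obtained with something like the following. This is only a sketch under stated assumptions: the function name sketch_parse_response() and the rule used for the universal case (when both . and , occur, the last one is the decimal separator; a lone separator is the decimal separator unless it is repeated) are hypothetical and are not existing Moodle code.

```php
<?php
/**
 * Hypothetical helper: convert a response to a float according to the
 * decimal separator chosen for the answer.
 *
 * @param string      $raw    the text typed by the student.
 * @param string|null $decsep '.', ',' or null for the "universal" option.
 * @return float|false the value, or false if the text is not a number.
 */
function sketch_parse_response($raw, $decsep = null) {
    $text = str_replace(' ', '', trim($raw));
    if ($decsep === null) {
        // Universal option: guess the decimal separator from the text itself.
        $dots = substr_count($text, '.');
        $commas = substr_count($text, ',');
        if ($dots > 0 && $commas > 0) {
            $decsep = (strrpos($text, '.') > strrpos($text, ',')) ? '.' : ',';
        } else if ($commas === 1 || $dots > 1) {
            $decsep = ',';
        } else {
            $decsep = '.';
        }
    }
    if ($decsep === ',') {
        // Drop . as thousands separator, then turn the decimal , into a PHP .
        $text = str_replace(array('.', ','), array('', '.'), $text);
    } else {
        // Drop , as thousands separator.
        $text = str_replace(',', '', $text);
    }
    if (!preg_match('/^[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?\z/', $text)) {
        return false;
    }
    return (float) $text;
}

$formats = array('1 234.56', '1,234.56', '1234.56', '1.23456E3',
                 '1 234,56', '1.234,56', '1234,56', '1,23456E3');
foreach ($formats as $format) {
    printf("%-10s universal: %s\n", $format, var_export(sketch_parse_response($format), true));
}
```

With a fixed decsep, a response written with the other separator still converts to a number, but to a different value (for example 1234,56 read with decsep . becomes 123456), which is how the NO cells of the table arise once the value is compared with the answer.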

How to grade a numerical response

Feasibility of adding number format as an additional numerical answer table

The answer field, being stored as TEXT, can easily hold a numerical value in any format, which can be validated easily in edit_numerical_form.php, with the following function modified accordingly:


   /**
    * Get an answer that contains the feedback and fraction that should be
    * awarded for this response.
    * @param number $value the numerical value of a response.
    * @return question_answer the matching answer.
    */
   public function get_matching_answer($value) {
       foreach ($this->answers as $aid => $answer) {
           if ($answer->within_tolerance($value)) {
               $answer->id = $aid;
               return $answer;
           }
       }
       return null;
   }

The code flow should allow each answer to be compared with the different values that result from the response text.

Answers need to be compared with the value converted by apply_unit(), which could differ depending on the decsep used. So either $value is an array of all the values obtained with the different decseps, or the numerical value is computed inside this function from the student's text response. This latter option could be the most flexible if we add other number formats such as fractions.

This is similar to tolerance handling, which is another answer parameter.
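The second option (computing the value inside the matching function, per answer) might look roughly like this. It is a minimal sketch: the class sketch_answer, its properties and the function sketch_get_matching_answer() are assumptions made for illustration and are not the Question Engine 2 classes.

```php
<?php
// Hypothetical answer object carrying its own decimal separator.
class sketch_answer {
    public $id = null;
    public $value;      // expected numerical value
    public $tolerance;  // absolute tolerance
    public $decsep;     // '.' or ','
    public $fraction;   // grade awarded when this answer matches

    public function __construct($value, $tolerance, $decsep, $fraction) {
        $this->value = $value;
        $this->tolerance = $tolerance;
        $this->decsep = $decsep;
        $this->fraction = $fraction;
    }

    /** Convert the response text using this answer's decimal separator. */
    public function value_from_response($response) {
        $text = str_replace(' ', '', trim($response));
        if ($this->decsep === ',') {
            $text = str_replace(array('.', ','), array('', '.'), $text);
        } else {
            $text = str_replace(',', '', $text);
        }
        return is_numeric($text) ? (float) $text : false;
    }

    public function within_tolerance($value) {
        return abs($value - $this->value) <= $this->tolerance;
    }
}

/** Return the first answer whose format and tolerance both accept the response. */
function sketch_get_matching_answer(array $answers, $response) {
    foreach ($answers as $aid => $answer) {
        $value = $answer->value_from_response($response);
        if ($value !== false && $answer->within_tolerance($value)) {
            $answer->id = $aid;
            return $answer;
        }
    }
    return null;
}

// Same quantity twice: full grade with . and a reduced grade with ,
$answers = array(
    11 => new sketch_answer(1234.56, 0.005, '.', 1.0),
    12 => new sketch_answer(1234.56, 0.005, ',', 0.8),
);
$match = sketch_get_matching_answer($answers, '1 234,56');
echo $match === null ? 'no match' : "answer {$match->id}, fraction {$match->fraction}", "\n";
```

Listing the same quantity with two separators and two fractions is one way a format penalty could be expressed with the existing answer and tolerance mechanism.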

...

Pierre Pichet 21:52, 15 May 2011 (WST)

Setting the answer when creating the question

A proposal

The teacher should be able to set

the numerical value using the PHP convention
the response format (various interfaces can be used)
the tolerance
the feedback

Here is a simplified proposal

1,9 and 2,0 interface

In Moodle 1,9 and 2,0, in edit_numerical_form.php, the number entered by the teacher (although students could be allowed to create questions, I will say teacher for clarity) must conform to PHP syntax (no thousands separator or space, and . as decimal separator). E syntax is allowed.

   if (!(is_numeric($trimmedanswer) || $trimmedanswer == '*')) {
       $errors['answer[' . $key . ']'] =
               get_string('answermustbenumberorstar', 'qtype_numerical');
   }

So the teacher must know the PHP specific number syntax.
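To make the consequence concrete, the short sketch below applies the same kind of test to a few ways of writing 1234.56. The list of candidate strings and the variable names are chosen here for illustration; is_numeric() is the standard PHP function used by the form.

```php
<?php
// Which spellings would pass a test of the form is_numeric($x) || $x is '*' ?
$candidates = array('1234.56', '1.23456E3', '1 234.56', '1,234.56', '1234,56', '1.23456 E3');

$accepted = array();
foreach ($candidates as $text) {
    $accepted[$text] = is_numeric($text) || $text === '*';
    printf("%-12s %s\n", $text, $accepted[$text] ? 'accepted' : 'rejected');
}
```

Only the two spellings that follow PHP syntax pass; any thousands separator, a decimal comma, or a space before the E leads to a validation error.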

In the numerical/questiontype.php function save_question_options($question) there is an additional verification mostly for numerical questions imported through various formats or inside a Cloze numerical multianswer question.

            
               $answer->answer = $this->apply_unit($answerdata, $units);
               if ($answer->answer === false) {
                   $result->notice = get_string('invalidnumericanswer', 'quiz');
               }

the Moodle 2,0 apply_unit

   
   /**
    * Checks if the $rawresponse has a unit and applys it if appropriate.
    *
    * @param string $rawresponse  The response string to be converted to a float.
    * @param array $units         An array with the defined units, where the
    *                             unit is the key and the multiplier the value.
    * @return float               The rawresponse with the unit taken into
    *                             account as a float.
    */
   function apply_unit($rawresponse, $units) {

       // Make units more useful
       $tmpunits = array();
       foreach ($units as $unit) {
           $tmpunits[$unit->unit] = $unit->multiplier;
       }
       // remove spaces and normalise decimal places.
       $rawresponse = trim($rawresponse) ;
       $search  = array(' ', ',');
       // test if a . is present or there are multiple , (i.e. 2,456,789 ) so that we don't need spaces and ,
       if ( strpos($rawresponse,'.' ) !== false || substr_count($rawresponse,',') > 1 ) {
           $replace = array('', '');
       }else { // remove spaces and normalise , to a . .
           $replace = array('', '.');
       }
       $rawresponse = str_replace($search, $replace, $rawresponse);

       // Apply any unit that is present.
       if (ereg('^([+-]?([0-9]+(\\.[0-9]*)?|\\.[0-9]+)([eE][-+]?[0-9]+)?)([^0-9].*)?$',
               $rawresponse, $responseparts)) {

           if (!empty($responseparts[5])) {

               if (isset($tmpunits[$responseparts[5]])) {
                   // Valid number with unit.
                   return (float)$responseparts[1] / $tmpunits[$responseparts[5]];
               } else {
                   // Valid number with invalid unit. Must be wrong.
                   return false;
               }

           } else {
               // Valid number without unit.
               return (float)$responseparts[1];
           }
       }
       // Invalid number. Must be wrong.
       return false;
   }

The 2,0 apply_unit allows more number formats than the test in the editing form.

regular numbers 13500.67 : 13 500.67 : 13500,67: 13 500,67

if you use , as thousand separator *always* put the decimal . as in 13,500.67 : 13,500.

for exponent form, say 1.350067 * 10⁴, use 1.350067 E4 : 1.350067 E04

The 1,9 apply_unit is more restrictive as it does not support , as decimal separator.
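The separator handling of the 2,0 apply_unit() can be tried in isolation with a sketch such as the one below. It assumes that the two $replace arrays hold empty strings, leaves out the unit handling, and uses preg_match() in place of ereg(); the function name sketch_normalise_2_0() is hypothetical.

```php
<?php
/**
 * Sketch of the 2,0 separator normalisation, without units.
 *
 * @param string $rawresponse the text typed by the student.
 * @return float|false the value PHP ends up with, or false if not a number.
 */
function sketch_normalise_2_0($rawresponse) {
    $rawresponse = trim($rawresponse);
    $search = array(' ', ',');
    if (strpos($rawresponse, '.') !== false || substr_count($rawresponse, ',') > 1) {
        // A . is present or there are several , so spaces and , are thousands separators.
        $replace = array('', '');
    } else {
        // Otherwise a single , is read as the decimal separator.
        $replace = array('', '.');
    }
    $rawresponse = str_replace($search, $replace, $rawresponse);
    if (preg_match('/^[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?$/', $rawresponse)) {
        return (float) $rawresponse;
    }
    return false;
}

$formats = array('1 234.56', '1,234.56', '1234.56', '1.23456E3',
                 '1 234,56', '1.234,56', '1234,56', '1,23456E3');
foreach ($formats as $format) {
    printf("%-10s => %s\n", $format, var_export(sketch_normalise_2_0($format), true));
}
```

In this sketch the German style 1.234,56 is still converted to a number, but to 1.23456 rather than 1234.56, so the response would be graded as a wrong value rather than rejected as an invalid number.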

More formats were allowed in 2,0 as the main objective in a numerical question is the numerical value. More about this further in the page (todo)

Retrieving the numerical value from the student response

As the function apply_unit() is also used to analyze the student response, the number formats allowed are the same as those allowed for numerical questions NOT created by edit_numerical_form.php, i.e. import or Cloze.

However, to help students know which number formats are allowed, a help icon beside the number text input element is shown on Moodle 2,0 numerical questions created either by edit_numerical_form.php or by import, but not for Cloze.

The current code does not detect the number format

The historical objective of apply_unit() was to convert various number formats so that they comply with the PHP norm for a number.

The most common thousands separators (i.e. space and ,) are filtered out.

In 2,0 the , as decimal separator is replaced by .

In all cases the result is a valid PHP numerical format, i.e. 123456.78 or 1.2345678e5

A clever student will never use thousands separators and will learn once and for all whether he can use , or . as decimal separator.

Shortanswer is a better way to test for number formats...

Converting number in a string to a double or float variable

The answer is stored in a TEXT database field or as a string in import-export files.

However, in the numerical questiontype code it is used as a numeric PHP value (float or double). From http://ca2.php.net/manual/en/language.types.string.php#language.types.string.conversion

String conversion to numbers

When a string is evaluated in a numeric context, the resulting value and type are determined as follows.

If the string *does not contain* any of the characters '.', 'e', or 'E' and the numeric value fits into integer type limits (as defined by PHP_INT_MAX), the string will be evaluated as an integer. In all other cases it will be evaluated as a float.

The value is given by the initial portion of the string. If the string starts with valid numeric data, this will be the value used. Otherwise, the value will be 0 (zero). Valid numeric data is an optional sign, followed by one or more digits (optionally containing a decimal point), followed by an optional exponent. The exponent is an 'e' or 'E' followed by one or more digits.

For more information on this conversion, see the Unix manual page for strtod(3).
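A few explicit casts illustrate the "initial portion of the string" rule quoted above. This is a small sketch with variable names chosen for illustration; it uses only the standard (float) cast and ordinary arithmetic on fully numeric strings.

```php
<?php
// Only the leading part that looks like a PHP number is used by the cast.
$full      = (float) '1234.56';    // whole string is numeric
$exponent  = (float) '1.23456E3';  // exponent form is numeric as well
$thousands = (float) '1,234.56';   // conversion stops at the first ,
$comma     = (float) '1234,56';    // the decimal comma is not understood
$text      = (float) 'abc';        // no leading number at all

// A string without '.', 'e' or 'E' becomes an integer, otherwise a float.
$asint   = '123' + 0;
$asfloat = '1.5' + 0;

var_dump($full, $exponent, $thousands, $comma, $text, is_int($asint), is_float($asfloat));
```

This is why a response such as 1,234.56 has to be cleaned by something like apply_unit() before the conversion: cast directly, it silently becomes a value close to 1 instead of an error.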

Unix manual page for strtod

http://compute.cnr.berkeley.edu/cgi-bin/man-cgi?strtod+3

DESCRIPTION

    The strtod(), strtof(), and strtold() functions convert  the
    initial  portion of the string pointed to by nptr to double,
    float, and long double representation,  respectively.  First
    they decompose the input string into three parts:
    1.  An initial,  possibly  empty,  sequence  of  white-space
        characters (as specified by isspace(3C))
    2.  A subject sequence interpreted as a floating-point  con-
        stant or representing infinity or NaN
    3.  A final string of one or more  unrecognized  characters,
        including the terminating null byte of the input string.
    Then they attempt to  convert  the  subject  sequence  to  a
    floating-point number, and return the result.
    The expected form of the subject  sequence  is  an  optional
    plus or minus sign, then one of the following:
      o  A non-empty sequence of digits optionally containing  a
         radix character, then an optional exponent part
      o  A 0x or 0X, then a non-empty  sequence  of  hexadecimal
         digits optionally containing a radix character, then an
         optional binary exponent part
      o  One of INF or INFINITY, ignoring case

...

    The radix character  is  defined  in  the  program's  locale
    (category  LC_NUMERIC).  In the POSIX locale, or in a locale
    where the radix character is not defined, the radix  charac-
    ter defaults to a period ('.').

So consider the LC_NUMERIC settings of the locale, shown here in the layout that PHP's localeconv() prints:

Array (

   [decimal_point] => .
   [thousands_sep] => 
   [int_curr_symbol] => 
   [currency_symbol] => 
   [mon_decimal_point] => 
   [mon_thousands_sep] => 
   [positive_sign] => 
   [negative_sign] => 
   [int_frac_digits] => 127
   [frac_digits] => 127
   [p_cs_precedes] => 127
   [p_sep_by_space] => 127
   [n_cs_precedes] => 127
   [n_sep_by_space] => 127
   [p_sign_posn] => 127
   [n_sign_posn] => 127
   [grouping] => Array
       (
       )

   [mon_grouping] => Array
       (
       )

)

If the locale defined decimal_point not as . as here but as , then strtod() would read 1,234.56 as 1,234 (one and 234 thousandths), i.e. smaller than 2, although the number written is greater than 1000.

PHP 5.3.6 zend_strtod()

However, in PHP 5.3.6 strtod() is replaced by zend_strtod().

ZEND_API double zend_strtod (CONST char *s00, char **se)
{
    int bb2, bb5, bbe, bd2, bd5, bbbits, bs2, c, dsign,
        e, e1, esign, i, j, k, nd, nd0, nf, nz, nz0, sign;
    CONST char *s, *s0, *s1;
    volatile double aadj, aadj1, adj;
    volatile _double rv, rv0;
    Long L;
    ULong y, z;
    Bigint *bb, *bb1, *bd, *bd0, *bs, *delta, *tmp;
    double result;
    CONST char decimal_point = '.';

...

        z = 10*z + c - '0';
    nd0 = nd;
    if (c == decimal_point) {
        c = *++s;
        if (!nd) {
            for(; c == '0'; c = *++s)
                nz++;
            if (c > '0' && c <= '9') {
                s0 = s;
                nf += nz;
                nz = 0;
                goto have_dig;
            }

So in PHP 5.3.6 zend_strtod() ALWAYS uses . as decimal_point.

Other functions allow the user to define the decimal_point or use the locale-defined decimal_point.
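The practical effect can be checked with a sketch like the following, which tries to switch LC_NUMERIC to a locale with a decimal comma before converting. The locale names are assumptions about what may be installed on the server; if none is available, setlocale() returns false and the conversion is simply done in the current locale.

```php
<?php
// Try a locale that uses , as decimal point, then convert strings to floats.
$previous = setlocale(LC_NUMERIC, '0');  // '0' only reads the current setting
$active = setlocale(LC_NUMERIC, array('de_DE.UTF-8', 'de_DE', 'fr_FR.UTF-8', 'fr_FR'));

$fromdot = (float) '1234.56';
$fromcomma = (float) '1234,56';

setlocale(LC_NUMERIC, $previous);        // restore the original setting

echo 'locale used: ', ($active === false ? 'unchanged' : $active), "\n";
var_dump($fromdot, $fromcomma);
```

Whatever the locale, the string with . converts to the full value and the string with , is cut at the comma, which is consistent with the hard-coded decimal_point in zend_strtod().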

is_numeric()

/**
 * Checks whether the string "str" with length "length" is numeric. The value
 * of allow_errors determines whether it's required to be entirely numeric, or
 * just its prefix. Leading whitespace is allowed.
 *
 * The function returns 0 if the string did not contain a valid number; IS_LONG
 * if it contained a number that fits within the range of a long; or IS_DOUBLE
 * if the number was out of long range or contained a decimal point/exponent.
 * The number's value is returned into the respective pointer, *lval or *dval,
 * if that pointer is not NULL.
 */
static inline zend_uchar is_numeric_string(const char *str, int length, long *lval, double *dval, int allow_errors)
{
    const char *ptr;
    int base = 10, digits = 0, dp_or_e = 0;
    double local_dval;
    zend_uchar type;

    if (!length) {
        return 0;
    }

    /* Skip any whitespace
     * This is much faster than the isspace() function */
    while (*str == ' ' || *str == '\t' || *str == '\n' || *str == '\r' || *str == '\v' || *str == '\f') {
        str++;
        length--;
    }
    ptr = str;

    if (*ptr == '-' || *ptr == '+') {
        ptr++;
    }

    if (ZEND_IS_DIGIT(*ptr)) {
        /* Handle hex numbers
         * str is used instead of ptr to disallow signs and keep old behavior */
        if (length > 2 && *str == '0' && (str[1] == 'x' || str[1] == 'X')) {
            base = 16;
            ptr += 2;
        }

        /* Skip any leading 0s */
        while (*ptr == '0') {
            ptr++;
        }

        /* Count the number of digits. If a decimal point/exponent is found,
         * it's a double. Otherwise, if there's a dval or no need to check for
         * a full match, stop when there are too many digits for a long */
        for (type = IS_LONG; !(digits >= MAX_LENGTH_OF_LONG && (dval || allow_errors == 1)); digits++, ptr++) {
check_digits:
            if (ZEND_IS_DIGIT(*ptr) || (base == 16 && ZEND_IS_XDIGIT(*ptr))) {
                continue;
            } else if (base == 10) {
                if (*ptr == '.' && dp_or_e < 1) {
                    goto process_double;
                } else if ((*ptr == 'e' || *ptr == 'E') && dp_or_e < 2) {
                    const char *e = ptr + 1;

                    if (*e == '-' || *e == '+') {
                        ptr = e++;
                    }
                    if (ZEND_IS_DIGIT(*e)) {
                        goto process_double;
                    }
                }
            }

            break;
        }

The decimal point is hard-coded in the test if (*ptr == '.' && dp_or_e < 1), so the test is not locale dependent.

Note that this function accepts numbers written in hexadecimal, so in numerical we need another function (like apply_unit()) to accept only decimals.
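One possible way to accept only decimal notation, independently of how is_numeric() treats hexadecimal strings in a given PHP version, is an explicit pattern test. The following is a sketch only: the function name sketch_is_plain_decimal() is hypothetical, and the pattern is the number part of the expression used in apply_unit().

```php
<?php
/**
 * Hypothetical check: true only for decimal or exponent notation with . as
 * decimal point, so hexadecimal, separators and units are all refused.
 */
function sketch_is_plain_decimal($text) {
    return preg_match('/^[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)([eE][-+]?[0-9]+)?\z/', $text) === 1;
}

foreach (array('1234.56', '1.23456E3', '-.5', '0x1A', '1,5', '12 kg') as $text) {
    printf("%-10s %s\n", $text, sketch_is_plain_decimal($text) ? 'decimal' : 'refused');
}
```

The \z anchor is used instead of $ so that a trailing newline is not silently accepted.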
