Homework Solution: Consider the largest single precision floating point number,…

    Floating Point “Question”

    1) Consider the largest single precision floating point number, 1.11…1 x 2^127. What will happen to the “gap” at that number if you change the type to double precision? DO NOT discuss the largest double precision number – just the largest single precision number represented as a double type.

    2) Now, consider the smallest single precision floating point number, 2^-149. What will happen to the “gap” at that number if you change the type to double precision? DO NOT discuss the smallest double precision number – just the smallest single precision number represented as a double type.

    Expert Answer

     
    Solution:

    1)

    Before proceeding, let’s discuss double precision and how the sign, exponent, and mantissa are laid out in it.

    Double precision is a 64-bit floating point format: 1 sign bit, 11 biased exponent bits, and 52 mantissa bits.

    So when the largest single precision number is represented as double precision, it will look like this:

    Sign Biased Exponent Mantissa
    0 10001111110 1111111111111111111111100000000000000000000000000000

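    So the bias for a double precision number is +1023,

    and Biased Exponent = Actual exponent + Bias = 127 + 1023 = 1150 (in binary: 10001111110),

    and in the mantissa, after the 23 1’s, the remaining 29 bits are padded with 0’s.

    The gap to the neighboring representable number is therefore 2^(127-52) = 2^75 in double precision, compared with 2^(127-23) = 2^104 in single precision: it shrinks by a factor of 2^29.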
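The encoding for part 1 can be checked with a short Python sketch. It assumes Python 3.9 or later (for `math.ulp`) and that Python floats are IEEE 754 doubles, which holds on common platforms. The helper names `float32_from_bits` and `double_fields` are hypothetical, introduced only for this illustration.

```python
import math
import struct


def float32_from_bits(bits):
    """Hypothetical helper: read a 32-bit pattern as an IEEE 754 single,
    returned as a Python float (a double), which holds it exactly."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def double_fields(x):
    """Hypothetical helper: return the (sign, biased exponent, mantissa)
    bit strings of x when stored as a 64-bit double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = format(bits, "064b")
    return s[0], s[1:12], s[12:]


# Largest finite single: bit pattern 0x7F7FFFFF, i.e. 1.11...1 x 2^127.
max_single = float32_from_bits(0x7F7FFFFF)
sign, exponent, mantissa = double_fields(max_single)
print(sign, exponent, mantissa)

# Gap in single precision: distance to the neighboring single just below
# (there is no larger finite single to measure against).
gap_single = max_single - float32_from_bits(0x7F7FFFFE)

# Gap in double precision: unit in the last place of the same value.
gap_double = math.ulp(max_single)

print("single gap is 2^104:", gap_single == 2.0 ** 104)
print("double gap is 2^75: ", gap_double == 2.0 ** 75)
```

The single precision gap is measured downward because the value is the largest finite single; the double precision gap is the same in both directions at this value, since it is not a power of two.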

    2)

    So the smallest single precision number, 2^-149, will be represented as shown below:

    Sign Biased Exponent Mantissa
    0 01101101010 0000000000000000000000000000000000000000000000000000

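    So the bias for a double precision number is +1023,

    and Biased Exponent = Actual exponent + Bias = (-149) + 1023 = 874 (in binary: 01101101010).

    In single precision, 2^-149 is a subnormal number and the gap there is 2^-149 itself. As a double it is a normal number with an all-zero mantissa, so the gap to the next larger number is 2^(-149-52) = 2^-201: it shrinks by a factor of 2^52.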
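The same check can be sketched for part 2. As before, this assumes Python 3.9 or later (for `math.ulp`) and IEEE 754 doubles for Python floats; `float32_from_bits` and `double_fields` are hypothetical helper names used only for illustration.

```python
import math
import struct


def float32_from_bits(bits):
    """Hypothetical helper: read a 32-bit pattern as an IEEE 754 single,
    returned as a Python float (a double), which holds it exactly."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def double_fields(x):
    """Hypothetical helper: return the (sign, biased exponent, mantissa)
    bit strings of x when stored as a 64-bit double."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    s = format(bits, "064b")
    return s[0], s[1:12], s[12:]


# Smallest positive single: bit pattern 0x00000001, the subnormal 2^-149.
min_single = float32_from_bits(0x00000001)
sign, exponent, mantissa = double_fields(min_single)
print(sign, exponent, mantissa)

# Gap in single precision: distance to the next larger single (2^-148).
gap_single = float32_from_bits(0x00000002) - min_single

# Gap in double precision: distance to the next larger double.
gap_double = math.ulp(min_single)

print("single gap is 2^-149:", gap_single == 2.0 ** -149)
print("double gap is 2^-201:", gap_double == 2.0 ** -201)
```

Here the gap is measured upward in both formats. Because 2^-149 is an exact power of two as a double, the gap to the next smaller double is half as large.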