Coding: Fleeting Thoughts

A place to discuss the implementation and style of computer programs.

Moderators: phlip, Moderators General, Prelates

User avatar
phlip
Restorer of Worlds
Posts: 7519
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: Coding: Fleeting Thoughts

Postby phlip » Sun Jan 01, 2017 2:52 am UTC

speising wrote:
Maybe even more importantly, any operation on a float has to return another float, since the value may be outside the range representable by ints.

What's intfloor(2147483649.5)?

Well, we're talking about a language with bignums, so that's not a concern here.

The lowlevel C floor function, that makes sense for. And the lowlowlevel assembly floor function it makes even more sense, because it means that opcode can be implemented entirely by the floating point components and it doesn't need to connect to the integer registers.

But a higher-level function doesn't need to be beholden to the lowlevel API...

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

User avatar
ahammel
My Little Cabbage
Posts: 2085
Joined: Mon Jan 30, 2012 12:46 am UTC
Location: Vancouver BC
Contact:

Re: Coding: Fleeting Thoughts

Postby ahammel » Sun Jan 01, 2017 3:17 am UTC

phlip wrote:
speising wrote:
Maybe even more importantly, any operation on a float has to return another float, since the value may be outside the range representable by ints.

What's intfloor(2147483649.5)?

Well, we're talking about a language with bignums, so that's not a concern here.
Even python doesn't have infinite and NaN bigints, though (right?).

I guess intfloor(NaN) could just throw an exception.
He/Him/His/Alex
God damn these electric sex pants!

User avatar
Yakk
Poster with most posts but no title.
Posts: 11012
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Sun Jan 01, 2017 5:43 am UTC

phlip wrote:For -21024 < x < 21024 (or thereabouts, I CBA to calculate the exact endpoints), every integer is exactly representable as a float, so you shouldn't get any rounding errors at all. Outside that range, every possible float value is an exact integer, so any rounding errors will take you to ints, not away from them (and in practise it just means that floor(x) == x, since x is always going to be integral).

Pigeonhole says that range is too large?
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

User avatar
phlip
Restorer of Worlds
Posts: 7519
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: Coding: Fleeting Thoughts

Postby phlip » Sun Jan 01, 2017 1:48 pm UTC

Yeah, had a brainfart there. 21024 is just the maximum number you can fit in a double at all, not range of integers in doubles.

That cutover is at 252 to 253. Every integer with absolute value less than 253 can be represented exactly as a double, and every double bigger than 252 is exactly an integer.

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

User avatar
ucim
Posts: 4800
Joined: Fri Sep 28, 2012 3:23 pm UTC
Location: The One True Thread

Re: Coding: Fleeting Thoughts

Postby ucim » Sun Jan 01, 2017 2:39 pm UTC

Ok, I didn't realize that floats could hold exact integers up to that big, and was confusing it with the results of float operations that should be integers not always coming out that way. For example, while 6/3 is always 2, is the float (or double) result of 6.0/3.0 guaranteed to be equal to the float representation of exactly 2?

If 6.0/3.0 is 1.99...9913, then floor(6.0/3.0) would be either 1 or 1.0; wrong either way. However, floor(6.0)/floor(3.0) would always be two if floor returned an int.

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

Demki
Posts: 159
Joined: Fri Nov 30, 2012 9:29 pm UTC

Re: Coding: Fleeting Thoughts

Postby Demki » Sun Jan 01, 2017 4:37 pm UTC

Seeing as floating point division is done by dividing the mantissa and subtracting the exponents, in the case of 6.0/3.0, you should get exactly 2.0, as the mantissa are 1.1 and 1.1, and the exponents would be 2 and 1( so we have (1.1 × 2^(2))/(1.1 × 2^(1)) so we get (1.1/1.1) × 2^(1), and since the mantissa are represented as integers, we have 1 × 2^(1) which is 2

9/3 would be (1.001 × 2^(3))/(1.1 × 2^(1)), so we have (1.001)/(1.1) × 2^(2), and applying integer division to (1.001/1.1) (which is just 1001/11 × 2^(-2)) gives 11 × 2^(-2) × 2^(2) = 11 × 2^(0), and since floating point values need to be normalized(ie of the form 1.mantissa × 2^(exponent)) we have 1.1 × 2^(1)

So yes, you get the expected result for integers. (Note the mantissa is basically a fixed point number, where the point is before the leftmost position and 1 is prepended, since all numbers start with 1. In normalized form we save a bit by not representing it in the mantissa)

User avatar
ucim
Posts: 4800
Joined: Fri Sep 28, 2012 3:23 pm UTC
Location: The One True Thread

Re: Coding: Fleeting Thoughts

Postby ucim » Sun Jan 01, 2017 10:23 pm UTC

Demki wrote:So yes, you get the expected result for integers.


So then, why is

Code: Select all

float length;
float width;
length = 10.0;
for (width=1.0 ; width=width+.01 ; length==width)
   print ("still too long");
print ("Finally, perfect!");
wrong?

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

korona
Posts: 495
Joined: Sun Jul 04, 2010 8:40 pm UTC

Re: Coding: Fleeting Thoughts

Postby korona » Sun Jan 01, 2017 11:00 pm UTC

ucim wrote:
Demki wrote:So yes, you get the expected result for integers.


So then, why is

Code: Select all

float length;
float width;
length = 10.0;
for (width=1.0 ; width=width+.01 ; length==width)
   print ("still too long");
print ("Finally, perfect!");
wrong?

Jose

Because floats cannot represent 1/100 exactly. Just like the decimal base cannot represent 1/3 exactly (if you only allow a finite number of digits). If you replace 0.01 by 0.25, 0.625 or similar (i.e. any number that is a sum of inverses of powers of 2) it works perfectly fine.

Demki
Posts: 159
Joined: Fri Nov 30, 2012 9:29 pm UTC

Re: Coding: Fleeting Thoughts

Postby Demki » Sun Jan 01, 2017 11:08 pm UTC

ucim wrote:
Demki wrote:So yes, you get the expected result for integers.


So then, why is

Code: Select all

float length;
float width;
length = 10.0;
for (width=1.0 ; width=width+.01 ; length==width)
   print ("still too long");
print ("Finally, perfect!");
wrong?

Jose

As Korona said, floating point values cannot accurately represent any fraction that does not have a power of 2 as its denominator.
Just curious, what language is that? First time I see a for loop in the order (initialization; increment; condition) instead of the order (initialization; condition; increment)

User avatar
ucim
Posts: 4800
Joined: Fri Sep 28, 2012 3:23 pm UTC
Location: The One True Thread

Re: Coding: Fleeting Thoughts

Postby ucim » Sun Jan 01, 2017 11:41 pm UTC

korona wrote:Because floats cannot represent 1/100 exactly.
Ah, ok. Makes sense. So if I did
floor(length)==floor(width)
then it wouldn't be wrong? (at least it wouldn't slither past the condition)

Demki wrote:Just curious, what language is that? First time I see a for loop in the order (initialization; increment; condition) instead of the order (initialization; condition; increment)
It's in C-mustard. It's an unintentional language that isn't taught much any more (though it is used more often than one might suspect). :)

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

Demki
Posts: 159
Joined: Fri Nov 30, 2012 9:29 pm UTC

Re: Coding: Fleeting Thoughts

Postby Demki » Mon Jan 02, 2017 5:35 am UTC

ucim wrote:
korona wrote:Because floats cannot represent 1/100 exactly.
Ah, ok. Makes sense. So if I did
floor(length)==floor(width)
then it wouldn't be wrong? (at least it wouldn't slither past the condition)

If you did that you would indeed get it to terminate, it probably will not be after the expected number of iterations. You could actually just use(in this case) floor(width)==length
It is still better to order comparisons on floating points rather than equality.
ucim wrote:
Demki wrote:Just curious, what language is that? First time I see a for loop in the order (initialization; increment; condition) instead of the order (initialization; condition; increment)
It's in C-mustard. It's an unintentional language that isn't taught much any more (though it is used more often than one might suspect). :)

Jose

Well, looking for it I found there are way too many programming languages.

User avatar
ucim
Posts: 4800
Joined: Fri Sep 28, 2012 3:23 pm UTC
Location: The One True Thread

Re: Coding: Fleeting Thoughts

Postby ucim » Mon Jan 02, 2017 8:21 pm UTC

Demki wrote:Well, looking for it I found there are way too many programming languages.
Yes, there are.

"Mustard" is OTTish for "screwup". C-mustard is the language I inadvertently created a few posts ago. It's identical to C except that in some cases the for statement has its conditions in a different order. Alas, I never created proper documentation for it.

I guess I could have also named it after C-sharp, calling it C-flat-on-your-face. :)

Jose
Order of the Sillies, Honoris Causam - bestowed by charlie_grumbles on NP 859 * OTTscar winner: Wordsmith - bestowed by yappobiscuts and the OTT on NP 1832 * Ecclesiastical Calendar of the Order of the Holy Contradiction * Please help addams if you can. She needs all of us.

User avatar
Xanthir
My HERO!!!
Posts: 5055
Joined: Tue Feb 20, 2007 12:49 am UTC
Location: The Googleplex
Contact:

Re: Coding: Fleeting Thoughts

Postby Xanthir » Thu Jan 05, 2017 10:28 pm UTC

The important thing to remember, as said above, is that floats are not random. They don't produce weird results "just because". They're a base-2 value, with a movable "window" of precision. As a base-2 value, they can only represent values divisible by a power of 2. This is largely incompatible with base-10 fractions, tho, and so most base-10 values turn into an inexact base-2 value (just like anything not divisible by powers of 2 or 5 can't be written as an exact decimal in base-10), and then you get further slight divergence when you convert them back into base-10 for printing.
(defun fibs (n &optional (a 1) (b 1)) (take n (unfold '+ a b)))

User avatar
Qaanol
The Cheshirest Catamount
Posts: 2990
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Coding: Fleeting Thoughts

Postby Qaanol » Fri Jan 06, 2017 5:11 pm UTC

Xanthir wrote:The important thing to remember, as said above, is that floats are not random. They don't produce weird results "just because". They're a base-2 value, with a movable "window" of precision. As a base-2 value, they can only represent values divisible by a power of 2. This is largely incompatible with base-10 fractions, tho, and so most base-10 values turn into an inexact base-2 value (just like anything not divisible by powers of 2 or 5 can't be written as an exact decimal in base-10), and then you get further slight divergence when you convert them back into base-10 for printing.

In addition to the accuracy problem you describe, there is also an issue of precision loss when adding (or subtracting) floating-point values of disparate magnitude:

a = x + y
b = a - y // algebraically equals x, but won't always in practice
c = b - x // algebraically equals 0, but won't always in practice

This may come up when summing a large number of values. As the total gets bigger, the low-order bits of the values “fall off the end” and get discarded instead of being added to the sum.

You can correct for it via compensated summation, by keeping a tally of how much has been lost and adding it to the next value before summing.
wee free kings

User avatar
raudorn
Posts: 310
Joined: Mon Jun 13, 2011 11:59 am UTC

Re: Coding: Fleeting Thoughts

Postby raudorn » Mon Jan 09, 2017 10:00 pm UTC

Assume initialization and bounds checking in the following.

Code: Select all

class Grid {
private:
    double* _data;
   
    double Cell(int pos) {
        return _data[pos];
    }
   
public:
    void crunch() {
        for(int i = 0; i < largeNumber; i++) {
            // Some arithmetic expression like
            float foo = (this->Cell(i) + this->Cell(i+1)) * this->Cell(i-1) + this->Cell(i);
        }
    }
}

Runtime: x seconds

Code: Select all

class Grid {
private:
    double* _data;
   
public:
    void crunch() {
        for(int i = 0; i < largeNumber; i++) {
            // Some arithmetic expression like
            float foo = (_data[i] +_data[i+1]) * _data[i-1] + _data[i];
        }
    }
}

Runtime: 3 * x seconds

[incoherent, angry screaming]
[breaking computer noises]
[sobbing]

korona
Posts: 495
Joined: Sun Jul 04, 2010 8:40 pm UTC

Re: Coding: Fleeting Thoughts

Postby korona » Mon Jan 09, 2017 10:35 pm UTC

And that, kids, is why we pass -O2 on the command line.
(Runtime: 0 seconds because the dead store and the loop is eliminated :P)

Seriously though, it does look quite strange. What compiler / options were you using? Can you paste the real code (the real calculation and initialization and bounds checking?

It could be a cache thrashing artifact.

User avatar
raudorn
Posts: 310
Joined: Mon Jun 13, 2011 11:59 am UTC

Re: Coding: Fleeting Thoughts

Postby raudorn » Mon Jan 09, 2017 11:01 pm UTC

I'm afraid the real code is a bit too lengthy to be properly discussed here. Basically we have a data grid that calculates various derivatives like

Code: Select all

real_t Grid::DC_vdv_y(const Iterator &it, const real_t &alpha) const {
  real_t ft = pow((this->Cell(it) + this->Cell(it.Top())), 2.0)
    - pow((this->Cell(it.Down()) + this->Cell(it)), 2.0);
  real_t st = fabs(this->Cell(it) + this->Cell(it.Top())) * (this->Cell(it) - this->Cell(it.Top()))
    - fabs(this->Cell(it.Down()) + this->Cell(it)) * (this->Cell(it.Down()) - this->Cell(it));
  return (0.25 * (ft + alpha * st)) / _geom->Mesh()[1];
}

and I thought "Hm, all those this->Cell(it) calls are superfluous, since they just access the data that is stored in the 'this' instance anyway." So I replaced them with direct calls to _data[it] and suddenly performance is three times slower. Default flags are "-Wall", "-Wextra", "-pedantic", "-std=c++11" and "-O3". Honestly, I'm not sure what they all do, since this is a class project and we use a given build system.

User avatar
Flumble
Posts: 1598
Joined: Sun Aug 05, 2012 9:35 pm UTC

Re: Coding: Fleeting Thoughts

Postby Flumble » Tue Jan 10, 2017 2:17 am UTC

raudorn wrote:Default flags are "-Wall", "-Wextra", "-pedantic", "-std=c++11" and "-O3". Honestly, I'm not sure what they all do, since this is a class project and we use a given build system.

The first three are for printing out every possible warning (any sign that you might not strictly follow the C++ standards), the std flag is for compiling like it's C++11 (rather than an older or newer version of the language) and -O3 is for optimizing the executable for speed (rather than memory usage, executable size or whatever).

Could it somehow be because of the int argument of Cell? I'd try running it with size_t as argument type and see if it runs thrice as slow.

User avatar
Yakk
Poster with most posts but no title.
Posts: 11012
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Tue Jan 10, 2017 11:00 am UTC

Another possivility is that the conversion to value means you don't have a `double&` but instead a `double`; ie, more locality. Simpler locality (working with temporary copies, not references to data in the array directly) could make a difference. You can emulate this with a `static_cast<double>( m_data[it] )`.

So cast in-to-`int` and out-to-`double` separately and together and see if any get faster.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

User avatar
raudorn
Posts: 310
Joined: Mon Jun 13, 2011 11:59 am UTC

Re: Coding: Fleeting Thoughts

Postby raudorn » Tue Jan 10, 2017 3:45 pm UTC

So, today I tried to replicate the problem... and couldn't. The same changes to the code now had no measurable impact on performance at all. I would've liked to look further into it, but we had more pressing concerns.

Yakk wrote:Another possivility is that the conversion to value means you don't have a `double&` but instead a `double`; ie, more locality.

I assumed something along those lines, but since the artifact went away, I doubt it again.

I guess the moral here is that how you measure performance is just as important and non-trivial than improving it.

liberonscien
Posts: 93
Joined: Thu Jan 21, 2016 7:04 pm UTC

Re: Coding: Fleeting Thoughts

Postby liberonscien » Sat Jan 14, 2017 9:14 pm UTC

Something occurred to me a while ago:
Is it possible to code a program that randomly generates new code, which the program then tests?
Like assign a number to every code piece, generate a string of pseudorandom numbers, translate that string into a generated code which the program then runs.
There would have to be failsafes in place that check and see if the generated code is generating errors. If it detects errors in the debug session it would start over and generate a new code. Also, if it somehow generates a code that loops, it would have to have a system in place that restarts the generation after a specified period of time.
For this to have any value at all, it would need to be able to save its generated codes.

I find myself wondering if something like this is possible, theoretically speaking, or is it impossible to create something like this?
I tend to be rather formal.
"People who ask questions that can be answered by a quick Google search are looking for interaction, not answers."

User avatar
raudorn
Posts: 310
Joined: Mon Jun 13, 2011 11:59 am UTC

Re: Coding: Fleeting Thoughts

Postby raudorn » Sat Jan 14, 2017 10:26 pm UTC

It's certainly possible, though I can't point to an existing implementation at the moment. That said, I can think of a few problems from the musing I did on the same topic:

1.) The halting problem means that we can't have a function that tells us whether the generated program will halt. What we can do is have a bunch of heuristics that will check against unreasonably long execution time or maybe check the loop conditions, but this will result in false negatives, e.g. programs that are useful and will halt, but not in the arbitrary time limit we set.
2.) The space of syntactically correct programs is vastly larger than the space of syntactically correct and useful programs, so we can't really expect to get a useful result anytime soon, where soon can take values like the lifespan of the universe.
3.) Checking the result is non-trivial in two ways:
a) If we check only the return value, we ignore useful programs that do not return anything, but have useful side effects. I guess in some languages all side effects are considered evil, so maybe in those it's not a problem.
b) We either can only check for classes of test parameters (e.g. test for -1, 0, 1, 2, 99999 and NaN) or we restrict them to a necessarily narrow space and test all combinations. The first is not guarranteed to properly check against invalid behaviour, since it can be hidden in parts of the parameter space that are not covered by the test. The second one is impossible in practice, unless you have a problem with an extremely narrow parameter space.

Tub
Posts: 230
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: Coding: Fleeting Thoughts

Postby Tub » Sun Jan 15, 2017 2:32 am UTC

The good news: programs are just finite-length strings of bits, so you can enumerate them if you want.

The bad news: proper testing is impossible (not just due to the halting problem), and heuristic testing will pass too many broken programs.

Say, you're testing a sorting algorithm on [], [1,2,3,4,5] and [3,2,5,4,1]. That might otherwise qualify as reasonable test cases. But which of the following two programs would you expect to be randomly generated first?

Code: Select all

for i = 1 to length(input)
  for j = 2 to length(input)
    if input[i] < input[i-1]
      t = input[i]
      input[i] = input[i-1]
      input[i-1] = t
return input

Code: Select all

if (input == [])
  return []
return [1,2,3,4,5]


Any course on theoretical computer science will teach you why brute-force programming is a bad idea. Even if you bypass the halting problem with a timeout, and even if you bypass proper testing with a handful of testcases, your runtime is still exponential against code length, and any code of value is too long for a computer to randomly generate within any reasonable timespan.
Seriously, there are hard upper bounds on the computing capacity of the whole observable universe, and if you do the math, then you're running out of universe before you've tested programs with maybe 50 characters. In some languages, that's not even enough for "hello world".

User avatar
phlip
Restorer of Worlds
Posts: 7519
Joined: Sat Sep 23, 2006 3:56 am UTC
Location: Australia
Contact:

Re: Coding: Fleeting Thoughts

Postby phlip » Sun Jan 15, 2017 3:24 am UTC

It's also worth name-dropping Rice's theorem, a generalisation of the halting problem that any question like "is this a sorting algorithm" is undecidable.

Code: Select all

enum ಠ_ಠ {°□°╰=1, °Д°╰, ಠ益ಠ╰};
void ┻━┻︵​╰(ಠ_ಠ ⚠) {exit((int)⚠);}
[he/him/his]

Tub
Posts: 230
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: Coding: Fleeting Thoughts

Postby Tub » Mon Jan 16, 2017 6:15 pm UTC

RRDTool seems like a pretty useful thing. It can store regular measurements (like CPU temperature, system load etc) in a way that avoids the overhead of RDBMS et al, and it'll automatically provide lower-resolution series with different cutoffs. For example, store the last 24 hours in 1-minute accuracy, the last week with 1 hour accuracy, the last year with daily accuracy.

Neat.

There's a whole ecosystem of tools that'll collect your data and insert it into a rrdtool database, making it pretty easy to setup comprehensive monitoring for your servers, your network devices, your home automation etc. collectd seems lightweight, feature-rich and modular.

Great.

It'll also produce little graphs from the stored data, With just a few lines of code, you can program a dashboard for your servers' status (or pick one of several existing implementations).

Awesome.

To produce the graphs, it uses cairo. When I want to install rrdtool, the dependencies include cairo, mesa, libdrm, several X libraries and wayland. On a headless server.

What the fuck. Who considered that to be a good idea?

korona
Posts: 495
Joined: Sun Jul 04, 2010 8:40 pm UTC

Re: Coding: Fleeting Thoughts

Postby korona » Mon Jan 16, 2017 8:30 pm UTC

At least on Debian RRDTool just pulls in cairo. This is fine: cairo is a generic vector graphics drawing library and fine for headless systems; it's the right tool for the job. However cairo is built with support for output to X11 surfaces and thus pulls in X11 libraries (in wayland based systems X11 is replaced by a combination of wayland and OpenGL, i.e. mesa + libdrm). That makes sense if you're running a Desktop (GTK+ draws via cairo; I'm not sure about Qt) but not so much if you're on a headless server. I do not know if cairo's upstream supports building the X11 part as a separate module; in any case this seems to be cairo's or the maintainer's fault and not a problem of RRDTool.


Return to “Coding”

Who is online

Users browsing this forum: No registered users and 6 guests