Coding: Fleeting Thoughts

A place to discuss the implementation and style of computer programs.


Qaanol
The Cheshirest Catamount
Posts: 3040
Joined: Sat May 09, 2009 11:55 pm UTC

Re: Coding: Fleeting Thoughts

Postby Qaanol » Tue Dec 05, 2017 2:01 am UTC

Xeio wrote:Today's Google doodle is nice.

They appear to measure “shortest solution” in terms of fewest instructions, not least movement.
wee free kings

hotaru
Posts: 1025
Joined: Fri Apr 13, 2007 6:54 pm UTC

Re: Coding: Fleeting Thoughts

Postby hotaru » Tue Dec 05, 2017 4:08 am UTC

Qaanol wrote:
Xeio wrote:Today's Google doodle is nice.

They appear to measure “shortest solution” in terms of fewest instructions, not least movement.

also, the "shortest solution" is wrong for the last one.

edit #1: same for number 4.

edit #2: and number 5.

Spoiler:
number 4, 6 instructions instead of 7: { { forward, forward, right } loop 4 times, left } loop 3 times

number 5, 5 instructions instead of 6: { { forward, right } loop 10 times, left } loop 18 times

number 6, 5 instructions instead of 6: { forward, right, forward, forward } loop 11 times

Code:

factorial = product . enumFromTo 1
isPrime n =
  factorial (n - 1) `mod` n == n - 1

speising
Posts: 2078
Joined: Mon Sep 03, 2012 4:54 pm UTC
Location: wien

Re: Coding: Fleeting Thoughts

Postby speising » Tue Dec 05, 2017 10:16 am UTC

hotaru wrote:
Qaanol wrote:
Xeio wrote:Today's Google doodle is nice.

Spoiler:
number 6, 5 instructions instead of 6: { forward, right, forward, forward } loop 11 times

there's a solution in 4. (hint: you're doing something 3 times)

Yakk
Poster with most posts but no title.
Posts: 11050
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Mon Dec 11, 2017 9:06 pm UTC

Xanthir wrote:You might be interested in looking into Streams, like RxJS and the like. The concepts are very similar, and they've thought about many of these problems already.

I've poked at them, but the documentation tends to focus on *using* them, not on how they were written and how they work.

A lot of what they are doing -- passing key-value property bundles around -- isn't what I'm aiming for. I want type safety and argument descriptions and the like.

Also, they seem OO, and I've been trying to be functional instead of OO. My pipe/source/sink elements are fundamentally *functions* (with a bit of type-dressing), while RxJS seems to have *objects* as its fundamental unit. I can see how to solve it with objects; maybe I should just bite the bullet, create real objects with a richer API, and just have a factory-from-function for when users don't care?
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Tub
Posts: 321
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: Coding: Fleeting Thoughts

Postby Tub » Tue Dec 12, 2017 12:39 pm UTC

Yakk wrote:I've poked at them, but the documentation tends to focus on *using* them, not on how they were written and how they work.

Well, they are open source...

I'm not entirely sure I read your notation right, but you seem to be trying to mix the two fundamental control principles of streams: push and pull.

In push-based systems, the Sources will generate new data whenever they feel like it, and call into their sources with the new data. This is useful in complex event processing, e.g. when data arrives in near real-time from sensors.

In pull-based systems, the call stack is the other way around. You call into your sources, requesting a piece of data, which in turn request data from their sources etc. This is commonly called an operator graph in database systems. Because your sink will only request as much data as it needs, this allows early aborting of queries suffixed with 'LIMIT n'. It's also used anywhere you get an Iterator.
Pull-based systems require all data to be available when requested, e.g. in memory or on a local disk. You can kinda get around that in a language with async/await or something, but that'll only turn your source<T> into source<Promise<T>>, and then you're usually better off with a push-based system.
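
To make the difference concrete, a rough C++ sketch (names invented, purely illustrative):

Code:

#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Push: the source owns the loop and calls into whatever consumes the data.
void push_source(std::vector<int> const& data, std::function<void(int)> sink)
{
    for (int x : data)
        sink(x);                       // the consumer has no say in when data arrives
}

// Pull: the consumer owns the loop and asks the source for the next item.
struct pull_source
{
    std::vector<int> const* data;
    std::size_t i = 0;
    std::optional<int> next()          // empty optional when exhausted
    {
        if (i == data->size()) return std::nullopt;
        return (*data)[i++];
    }
};

int limited_sum(pull_source src, int limit)
{
    int sum = 0;
    for (int n = 0; n < limit; ++n) {  // early abort, as with 'LIMIT n'
        auto v = src.next();
        if (!v) break;
        sum += *v;
    }
    return sum;                        // the source never produced more than was asked for
}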

In push-based systems, no pipe or sink needs to know about its sources. In a pull-based system, no pipe or source needs to know about its sinks. Your definitions of source and sink seem to imply a push-based system, but then your pipes get a reference to both a source and a sink, and that seems weird.

You can try to create a system that somehow does both (rxjs appears to do so), but (from a cursory glance) rxjs just implements two systems in the same library with similar APIs. Pick one and start with it.


You can do everything functional, but IMHO it's easier to model as an (acyclic, connected, directed) graph of nodes, each being an object with a standardized API, like push<T> or observe<callback<T>> (for push-based) or get<T> or getIterator<T> (for pull-based). Passing the connections in a type-safe way in the constructor just seems cleaner and more robust than mixing connections and data in an arbitrary argument list.
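
For the pull-based flavour, something roughly like this is what I have in mind (just a sketch, every name here is invented):

Code:

#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// One standardized pull API; upstream connections are passed type-safely in
// the constructor, and data only ever flows through get().
template <class T>
struct Node {
    virtual std::optional<T> get() = 0;
    virtual ~Node() = default;
};

struct VectorSource : Node<int> {
    std::vector<int> data;
    std::size_t i = 0;
    explicit VectorSource(std::vector<int> d) : data(std::move(d)) {}
    std::optional<int> get() override {
        return i < data.size() ? std::optional<int>(data[i++]) : std::nullopt;
    }
};

struct Doubler : Node<int> {
    Node<int>& upstream;               // the connection, fixed at construction
    explicit Doubler(Node<int>& up) : upstream(up) {}
    std::optional<int> get() override {
        auto v = upstream.get();
        return v ? std::optional<int>(*v * 2) : std::nullopt;
    }
};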

Yakk
Poster with most posts but no title.
Posts: 11050
Joined: Sat Jan 27, 2007 7:27 pm UTC
Location: E pur si muove

Re: Coding: Fleeting Thoughts

Postby Yakk » Wed Dec 13, 2017 4:17 pm UTC

Tub wrote:
Yakk wrote:I've poked at them, but the documentation tends to focus on *using* them, not on how they were written and how they work.

Well, they are open source...

And so is Linux. That doesn't mean it's reasonable to learn how the Linux kernel's overall architecture is designed by reading the source. ;)
I'm not entirely sure I read your notation right, but you seem to be trying to mix the two fundamental control principles of streams: push and pull.

So, I'm using continuation passing style.

A function that returns X instead takes a function that is passed X.

This permits a few things. First, a function can return *more than once*, or zero times. Second, you can return *view types* into stack data.

The first seems useful for streaming operations; sometimes one unit of data becomes 3 or 0. The second gets rid of an inefficiency in returning data of statically dynamic size (static in implementation; dynamic in interface).
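
Roughly, in C++ (a sketch with made-up names, not my actual interfaces):

Code:

#include <cstddef>
#include <string>
#include <string_view>

// Instead of returning tokens, this "returns" by calling the continuation:
// zero times for a blank line, many times otherwise. What it passes is a
// view into the caller's data, so nothing is copied or allocated.
template <class Continuation>
void split_words(std::string const& line, Continuation&& yield)
{
    std::size_t pos = 0;
    while (pos < line.size()) {
        std::size_t end = line.find(' ', pos);
        if (end == std::string::npos) end = line.size();
        if (end != pos)
            yield(std::string_view(line).substr(pos, end - pos));
        pos = end + 1;
    }
}

// usage: split_words(text, [](std::string_view word) { /* consume word */ });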

In push-based systems, the Sources will generate new data whenever they feel like it, and call into their sources with the new data. This is useful in complex event processing, e.g. when data arrives in near real-time from sensors.

I think you mean "call into their sinks".
In pull-based systems, the call stack is the other way around. You call into your sources, requesting a piece of data, which in turn request data from their sources etc. This is commonly called an operator graph in database systems. Because your sink will only request as much data as it needs, this allows early aborting of queries suffixed with 'LIMIT n'. It's also used anywhere you get an Iterator.
Pull-based systems require all data to be available when requested, e.g. in memory or on a local disk. You can kinda get around that in a language with async/await or something, but that'll only turn your source<T> into source<Promise<T>>, and then you're usually better off with a push-based system.

Mine is a pull-based system with continuation passing style.
In push-based systems, no pipe or sink needs to know about its sources. In a pull-based system, no pipe or source needs to know about its sinks. Your definitions of source and sink seem to imply a push-based system, but then your pipes get a reference to both a source and a sink, and that seems weird.

The sink is only known about insofar as I *return through it*.

Instead of T(U), we get void(U, continuation(T)). The transformation of (U->T) to (U->(T->void)->void) is a brainless one. Except we get (U->T*) to (U->(T->void)->void) out of it for free.
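
i.e. the lift is just (sketch):

Code:

#include <utility>

// Lift a plain U -> T function into the void(U, continuation(T)) shape.
template <class F>
auto lift(F f)
{
    return [f](auto&& u, auto&& k) {            // k is the continuation
        k(f(std::forward<decltype(u)>(u)));     // exactly one "return"
    };
}
// A hand-written CPS function is free to call k zero times or several times instead.
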
You can do everything functional, but IMHO it's easier to model as an (acyclic, connected, directed) graph of nodes, each being an object with a standardized API, like push<T> or observe<callback<T>> (for push-based) or get<T> or getIterator<T> (for pull-based). Passing the connections in a type-safe way in the constructor just seems cleaner and more robust than mixing connections and data in an arbitrary argument list.

There isn't any data in my system. Pipes don't *take data*, they take connections.

Source, Sink and Pipe are 3 kinds of graphs. They can be composed in certain ways. If you connect a source to a pipe, you get a source of the pipe's output. If you connect a pipe to a sink, you get a sink of the pipe's input. If you connect a pipe to a pipe, you get a pipe.

If you connect a source to a sink, you get an executable object. Running it executes the graph.
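
A sketch of what I mean, with everything type-erased for brevity (my real version keeps the concrete types so it can inline; names invented):

Code:

#include <functional>
#include <iostream>

// A Sink is just a continuation, a Pipe maps a sink for its output to a sink
// for its input, and a Source pushes its data into whatever sink it is given.
using Sink   = std::function<void(int)>;
using Pipe   = std::function<Sink(Sink)>;
using Source = std::function<void(Sink)>;

Source connect(Source src, Pipe p)                  // source + pipe -> source
{
    return [=](Sink k) { src(p(k)); };
}
Sink connect(Pipe p, Sink k) { return p(k); }       // pipe + sink   -> sink
Pipe connect(Pipe a, Pipe b)                        // pipe + pipe   -> pipe
{
    return [=](Sink k) { return a(b(k)); };
}
std::function<void()> connect(Source src, Sink k)   // source + sink -> runnable
{
    return [=] { src(k); };
}

int main()
{
    Source nums   = [](Sink k) { for (int i = 1; i <= 5; ++i) k(i); };
    Pipe   square = [](Sink k) { return Sink([=](int x) { k(x * x); }); };
    Sink   print  = [](int x) { std::cout << x << '\n'; };
    connect(connect(nums, square), print)();        // prints 1 4 9 16 25
}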

You can work with any of the 3 types of graphs. You can consume data from a source by passing in a continuation. You can provide data to a sink. You can do both (at once) to a pipe to process data.

The fact that nodes are also sinks/sources or pipes simply reflects that a node is a graph.

The syntax ends up being pretty clean, and as I don't type erase unless requested, everything can be inlined.

What I'm missing is a clean way to handle multiple inputs/outputs. Reading the react streams has given me a few thoughts, but their system seems really tied to timers and async and property bags and all that mess.
One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision - BR

Last edited by JHVH on Fri Oct 23, 4004 BCE 6:17 pm, edited 6 times in total.

Tub
Posts: 321
Joined: Wed Jul 27, 2011 3:13 pm UTC

Re: Coding: Fleeting Thoughts

Postby Tub » Wed Dec 13, 2017 11:12 pm UTC

Yakk wrote:
Tub wrote:Well, they are open source...

And so is Linux. That doesn't mean it's reasonable to learn how the Linux kernel's overall architecture is designed by reading the source. ;)

Eh, if you wish to save time, just skip the comments. ;)

Yakk wrote:Source, Sink and Pipe are 3 kinds of graphs. They can be composed in certain ways. If you connect a source to a pipe, you get a source of the pipe's output. If you connect a pipe to a sink, you get a sink of the pipe's input. If you connect a pipe to a pipe, you get a pipe.

Yeah, but that's exactly the part where I think you're overcomplicating things. You wish for the ability to extend the graph on all ends, and that's going to end up more complicated than a graph that you can recursively create bottom-up, or maybe iteratively create bottom-up in a long chained method call. A single data type for nodes ("source" when pull-based) should suffice, and it makes design and reasoning a lot simpler.
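
Something like this is all I'd expect to need (sketch only, every name invented):

Code:

#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// A single node type; the graph only ever grows at the output end, by
// wrapping the previous source in a new one.
template <class T>
struct Source {
    std::function<std::optional<T>()> next;

    template <class F>
    auto map(F f) {
        using U = decltype(f(std::declval<T>()));
        auto prev = next;
        return Source<U>{[prev, f]() -> std::optional<U> {
            if (auto v = prev()) return f(*v);
            return std::nullopt;
        }};
    }
};

template <class T>
Source<T> from_vector(std::vector<T> data) {
    auto d = std::make_shared<std::vector<T>>(std::move(data));
    auto i = std::make_shared<std::size_t>(0);
    return Source<T>{[d, i]() -> std::optional<T> {
        if (*i == d->size()) return std::nullopt;
        return (*d)[(*i)++];
    }};
}

// usage:
//   auto s = from_vector(std::vector<int>{1, 2, 3}).map([](int x) { return x + 1; });
//   while (auto v = s.next()) { /* consume *v */ }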

I've worked with quite a few operator graphs, iterator implementations, workflow systems and other "stream" implementations, and in exactly none of them did anyone see a need to create three different data types, or to extend graphs on multiple sides. I'm sure it can be done (though I'm too rusty to do it in a functional language), but I wonder why one should.

Yakk wrote:What I'm missing is a clean way to handle multiple inputs/outputs.

Yeah, that seems easier in an OO language where you just slap an array of inputs/outputs on each object. It's also easy in a functional language if you restrict yourself to constructing the graph from one side. As for what you're trying in a functional language... I see where your troubles come from.

Xeio
Friends, Faidites, Countrymen
Posts: 5089
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\

Re: Coding: Fleeting Thoughts

Postby Xeio » Thu Dec 14, 2017 4:40 am UTC

The heck... how the hell does an execution plan for an Index Scan read 2+ million rows from an index for a table with only 10 thousand rows?

And then when I drop and recreate the index the problem magically goes away...

Thesh
Made to Fuck Dinosaurs
Posts: 5519
Joined: Tue Jan 12, 2010 1:55 am UTC
Location: Colorado

Re: Coding: Fleeting Thoughts

Postby Thesh » Thu Dec 14, 2017 5:26 am UTC

Inaccurate statistics. Dropping and recreating the index rebuilds the statistics too. Why so many rows? Depends. If it's a loop, it could be reading the same rows multiple times and then filtering them out. Sometimes I've seen SQL Server do a Cartesian join and then filter when it could have done a merge join, because fuck you, that's why.

SQL Server can be quirky with its execution plans. For example, when it joins it always assumes some rows won't join; we had a huge staging database with several million rows per day per table, each record having a 1:1 correspondence, across hundreds of tables. After 5 or 6 joins it would assume only 1 row would come back and optimize the execution plan accordingly (index seek, nested loops). The only way I could fix it was with join hints.
Honesty replaced by greed, they gave us the reason to fight and bleed
They try to torch our faith and hope, spit at our presence and detest our goals

Xeio
Friends, Faidites, Countrymen
Posts: 5089
Joined: Wed Jul 25, 2007 11:12 am UTC
Location: C:\Users\Xeio\

Re: Coding: Fleeting Thoughts

Postby Xeio » Thu Dec 14, 2017 5:53 am UTC

No loop, just a "where in" clause which apparently caused the actual rows read to grow exponentially as the number of items in the list grew.

Granted, it was still slow-ish after the index rebuild, so I tweaked the way I was requesting the data from Entity Framework. Now it just does some voodoo magic with an IQueryable rather than a list, which turned the "where in" clause into a straight inner join, which is neat.

So I learned a little about leaving something as an IQueryable when I don't necessarily need it as a list.

Thesh
Made to Fuck Dinosaurs
Posts: 5519
Joined: Tue Jan 12, 2010 1:55 am UTC
Location: Colorado

Re: Coding: Fleeting Thoughts

Postby Thesh » Thu Dec 14, 2017 6:21 am UTC

I meant a nested loop join in the execution plan; they can result in the table being read multiple times in some situations. For example, the outer table has 10,000 rows, the subquery returns 200 rows, and for each record in the outer table it reads all 200 rows from the inner table (the subquery result), joins them, and then filters the result.
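
In C++-flavoured pseudocode (purely illustrative), the shape of the problem is:

Code:

#include <vector>

struct Row { int key; };

// A nested loop join scans the whole inner input once per outer row, so the
// "rows read" counter grows as outer_rows * inner_rows even though neither
// input is large: 10,000 outer rows * 200 inner rows = 2,000,000 rows read,
// which is how a 10,000-row table turns into "2+ million rows" in the plan.
long nested_loop_join_reads(std::vector<Row> const& outer,
                            std::vector<Row> const& inner)
{
    long rows_read = 0;
    for (Row const& o : outer) {
        for (Row const& i : inner) {   // the inner side is re-read every time
            ++rows_read;
            if (o.key == i.key) {
                // emit joined row
            }
        }
    }
    return rows_read;
}
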
Honesty replaced by greed, they gave us the reason to fight and bleed
They try to torch our faith and hope, spit at our presence and detest our goals

