From nicholasdrozd at gmail.com Tue Mar 2 00:25:08 2021
From: nicholasdrozd at gmail.com (Nicholas Drozd)
Date: Mon, 1 Mar 2021 18:25:08 -0600
Subject: Busy Beaver stuff
Message-ID:

Hello Friends! I've missed our meetings tremendously this past year. Hopefully we will be together again soon. For now, I'd like to share some recent research I've been doing. There are four sections to this message:

- BACKGROUND
- NEW STUFF
- BLOG POSTS
- HELP WANTED <-- exciting open problems !!! cash bounty $$$

* BACKGROUND

The Busy Beaver problem, first posed in 1962, asks: what is the longest that a Turing machine of N states can run before halting when started on the blank tape? The function that maps the number of states to the number of steps executed grows astoundingly fast:

1 : 1
2 : 6
3 : 21
4 : 107
5 : 47176870 (?)

The question mark indicates the value has not been formally proved. It's known that a 5-state TM program exists that halts in 47176870 steps, but nobody knows how to prove that there aren't any programs that run longer than that. In the 6-state case, there is a program that has been claimed to halt in approximately 7.4 × 10^36534 steps. That's too long for me to personally verify.

This sequence grows incomputably fast, faster than any computable function. If you could somehow obtain values for it generally, you could solve the Halting Problem, and thereby also solve a huge range of open math problems, like Goldbach's Conjecture.

* NEW STUFF

The classic Busy Beaver problem looks for how long a program can run before halting. Now, in order to halt, a Turing machine must execute an explicit halt instruction. But this means that a program has to waste a whole instruction in order to halt. Halting is just one instance of a "termination condition", and faster-growing sequences can be defined by requiring different termination conditions. Some of these conditions are fairly technical, but a simple one is the "blank tape condition": how long can a program run, when started on the blank tape, before leaving the tape blank? This is the Blanking Beaver problem, and it grows faster than the classic Busy Beaver sequence:

1 : 1
2 : 8
3 : 34
4 : 66345 (?)

This is a novel question to ask, and in trying to answer it I have been able to discover Turing machine programs that AFAIK nobody had ever looked at before. Of course, I didn't write any of these programs myself. That would require an inconceivable level of cleverness. Instead, they were found by exhaustive search.

* BLOG POSTS

I've written a bunch of posts about various aspects of this research. Perhaps the aspect that will be most interesting to this group is the Turing machine "simulator" I developed. Initially I was using a simulator written in Python, but it turned out to be too slow. Searching exhaustively means generating a huge list of Turing machine programs and then running them for some specified number of steps, and this requires a fast simulator. I thought about it for a while and came up with a plan for a simulator written in C. And let me tell you, this simulator is *fast as shit*. Like, crazy fast. I don't see how a simulator written in a high-level language could be any faster. The cherry on top is that the whole thing is written as an elaborate stack of preprocessor macros, so I know that Joe will hate it! (Oh, and it relies critically on a non-standard GCC extension -- that makes two cherries on top!)
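To give a flavor of the idea, here is a minimal stand-alone sketch -- not the actual pytur code, just an illustration -- of that non-standard extension, GCC's labels-as-values / computed goto, compiled by hand for the 2-state champion "1RB 1LB 1LA 1RH", which halts after 6 steps:

// -------------------------------------------------------------------
/* Hand-compiled 2-state Busy Beaver: one label per (state, symbol). */
#include <stdio.h>

#define TAPELEN 64

int main(void)
{
    static unsigned char tape[TAPELEN];
    unsigned char *head = tape + TAPELEN / 2;
    unsigned long steps = 0;

    /* one dispatch table per state, indexed by the scanned symbol */
    static void *state_A[] = { &&A0, &&A1 };
    static void *state_B[] = { &&B0, &&B1 };

    goto *state_A[*head];

A0: ++steps; *head = 1; ++head; goto *state_B[*head];   /* 1RB */
A1: ++steps; *head = 1; --head; goto *state_B[*head];   /* 1LB */
B0: ++steps; *head = 1; --head; goto *state_A[*head];   /* 1LA */
B1: ++steps; *head = 1; ++head; goto halt;              /* 1RH */

halt:
    printf("%lu\n", steps);
    return 0;
}
// -------------------------------------------------------------------

(The && label addresses and the goto * dispatch are the non-standard part; build it with gcc and it should print 6.)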
Post discussing this simulator and the philosophical considerations that motivated its development, plus links to the current code:

- https://nickdrozd.github.io/2020/09/14/programmable-turing-machine.html
- https://github.com/nickdrozd/pytur/blob/master/machines/machine.h
- https://github.com/nickdrozd/pytur/blob/master/machines/5-2.c

As discussed above, the thrust of this research is to replace halting in the Busy Beaver problem with other termination conditions. The simplest of these is the blank tape condition. Two more complicated termination conditions are what I'm calling "quasihalting" and "Lin recurrence". Discussion of these conditions and their associated sequences:

- https://nickdrozd.github.io/2021/01/14/halt-quasihalt-recur.html
- https://nickdrozd.github.io/2021/02/14/blanking-beavers.html
- https://nickdrozd.github.io/2021/02/24/lin-recurrence-and-lins-algorithm.html

One of the problems with the exhaustive search strategy for program discovery is that the number of programs of N states grows on the order of N^N. This is very very bad. Here is a post about the "spaghetti code conjecture", which aims to reduce the search space by applying graph theory:

- https://nickdrozd.github.io/2021/01/26/spaghetti-code-conjecture.html

Some values of the Busy Beaver function have been "proved" and some have not. Here's a discussion of what "proof" means in this case:

- https://nickdrozd.github.io/2020/12/15/lin-rado-proof.html

Finally, here is a philosophical discussion of whether or not the Busy Beaver sequence is, as is often claimed, "perfectly well-defined" in spite of its incomputability:

- https://nickdrozd.github.io/2020/10/15/busy-beaver-well-defined.html

* HELP WANTED

YOU, TODAY, can make tangible progress in the field of Busy Beaver research! Well, maybe not today, but soon! There is lots of low-to-medium-hanging fruit just waiting to be picked. Back in January I was able to convince Boyd Johnson to implement a Lin recurrence-detecting Turing machine simulator in Rust, and within weeks he was able to find a TM program that enters into Lin recurrence after 158491 steps. That is an extraordinary find, and it is a program that presumably nobody had ever looked at before -- a novel contribution to human knowledge! Here is Boyd's code:

- https://github.com/boydjohnson/lin-rado-turing

A top priority for me is to find a 5-state Blanking Beaver champion. It was discovered in the late 80s that there is a 5-state program that halts after 47176870 steps. I believe that there is a non-halting program that ends up wiping its tape blank in some time longer than that, but I haven't been able to find one. I also believe that finding one is within the grasp of someone with mildly powerful hardware.

Another open problem is to discover a non-trivial quasihalting program that is NOT Lin recurrent. This is a somewhat technical problem; see the blog posts above for details. More open problems are listed in those many blog posts.

What's in it for you? Fame, for one. You will forever be known as the discoverer of some new Turing machine program. There is also the intellectual thrill of working on a problem that is provably insurmountable in the general case. If you are looking for a challenge to your engineering and hacking skills, here it is. Software, hardware, theory, practice, all of it will be pushed to the limit. And last but not least...
$$$ CASH $$$

I am hereby offering a CASH BOUNTY of 200 USD to the first person who finds a TM program that, when started on the blank tape, ends up with a blank tape after (and not before) 50000000 (fifty million) steps.

In computability theory terminology, the problem of determining whether or not a program reaches some configuration in a specified number of steps is known as "decidable", so there should be no issues with judging correctness. A winning entry consists of a program and a number of steps. The program will be run for that many steps starting with the blank tape, and if the tape gets blanked after (and not before) 50000000 steps, the submitter will get 200 smackeroos.

($200 is not a lot of money in the grand scheme of things, but I hope that it's enough to show that I take this enterprise seriously. Cash to be paid in person in Minneapolis.)

Of course, I am open to any questions and comments, and I am also available to help with anyone's searching. As you may have surmised from all this text, I have found working on the Busy Beaver problem to be immensely rewarding and engaging, and I think some others here may find that they feel the same way.

From joe at begriffs.com Sat Mar 6 01:23:21 2021
From: joe at begriffs.com (Joe Nelson)
Date: Fri, 5 Mar 2021 19:23:21 -0600
Subject: Busy Beaver stuff
In-Reply-To:
References:
Message-ID: <20210306012321.GH3869@begriffs.com>

Nicholas Drozd wrote:
> I thought about it for a while and came up with a plan for a simulator
> written in C. ... The cherry on top is that the whole thing is
> written as an elaborate stack of preprocessor macros, so I know that
> Joe will hate it!

Your IOCCC studies are paying off! ;)

What's interesting is you've essentially made a compiler from "Turing language" like

    1RB 0LC 1LD 0LA 1RC 1RD 1LA 0LD

to machine code. Your blog post mentioned initially going Python -> C -> Native, and then you refined it to Preprocessor -> C -> Native.

If you're iterating through a bunch of random Turing programs, that compilation process introduces some inefficiency, because you're spawning the compiler process fresh for each program, and reading and writing to disk. A faster way might be in-memory Just In Time (JIT) compiling. Both LLVM and GCC provide a library interface that you can link with to do compilation without spawning a separate process. You build the assembly language in memory, and GCC/LLVM compiles it to machine code also in memory, and essentially gives you a function pointer to the generated code. You can then just call the function.

The library interfaces are daunting, but it sounds like an interesting project:

* https://llvm.org/docs/tutorial/index.html#building-a-jit-in-llvm
* https://gcc.gnu.org/onlinedocs/jit/

LLVM has an Intermediate Representation (IR) assembly language that is more friendly to program in than x86. https://llvm.org/docs/LangRef.html#introduction

As a first step, it would be interesting to try to simply turn one of those Turing programs into LLVM IR by hand, and compile it on the command line to see if we can figure out how that works. After that we could see how to build the same IR in memory and JIT it. I'm curious whether the speed of the compiled program compensates for the compilation overhead.

Going further down the rabbit hole, we could run two threads, where one JITs and the other runs a one-size-fits-all simulator, and whichever thread finishes first cancels the other one.
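To make that last idea concrete, the thread race might be shaped something like this skeleton. The worker bodies here are just invented placeholders (a sleep standing in for real simulation and real JIT work), so treat it as a sketch of the plumbing only, not working simulator code:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done = PTHREAD_COND_INITIALIZER;
static const char *winner;
static long result = -1;

/* the first worker to finish records its answer; later finishers are ignored */
static void finish(const char *who, long steps)
{
	pthread_mutex_lock(&lock);
	if (!winner)
	{
		winner = who;
		result = steps;
		pthread_cond_signal(&done);
	}
	pthread_mutex_unlock(&lock);
}

static void *interpreter(void *arg)
{
	(void)arg;
	sleep(2);                     /* placeholder: step-by-step simulation */
	finish("interpreter", 47176870);
	return NULL;
}

static void *jit_path(void *arg)
{
	(void)arg;
	sleep(1);                     /* placeholder: compile in memory, then run */
	finish("jit", 47176870);
	return NULL;
}

int main(void)
{
	pthread_t a, b;
	pthread_create(&a, NULL, interpreter, NULL);
	pthread_create(&b, NULL, jit_path, NULL);

	pthread_mutex_lock(&lock);
	while (!winner)
		pthread_cond_wait(&done, &lock);
	pthread_mutex_unlock(&lock);

	pthread_cancel(a);            /* cancelling an already-finished thread is harmless */
	pthread_cancel(b);
	pthread_join(a, NULL);
	pthread_join(b, NULL);

	printf("%s won with %ld steps\n", winner, result);
	return 0;
}

(Compile with -pthread.)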
From joe at begriffs.com Sat Mar 6 05:36:09 2021
From: joe at begriffs.com (Joe Nelson)
Date: Fri, 5 Mar 2021 23:36:09 -0600
Subject: Busy Beaver stuff
In-Reply-To: <20210306012321.GH3869@begriffs.com>
References: <20210306012321.GH3869@begriffs.com>
Message-ID: <20210306053609.GI3869@begriffs.com>

Joe Nelson wrote:
> Going further down the rabbit hole, we could run two threads, where
> one JITs and the other runs a one-size-fits-all simulator, and
> whichever thread finishes first cancels the other one.

I wrote up a one-size-fits-all simulator tonight https://github.com/begriffs/turing

How come many of the programs in your blog post don't have a halt state? https://nickdrozd.github.io/2020/10/09/beeping-busy-beaver-results.html

From nicholasdrozd at gmail.com Sun Mar 7 20:53:57 2021
From: nicholasdrozd at gmail.com (Nicholas Drozd)
Date: Sun, 7 Mar 2021 14:53:57 -0600
Subject: Busy Beaver stuff
In-Reply-To: <20210306053609.GI3869@begriffs.com>
References: <20210306012321.GH3869@begriffs.com> <20210306053609.GI3869@begriffs.com>
Message-ID:

> How come many of the programs in your blog post don't have a halt state?

My focus has been on programs that don't halt, but that instead reach some other termination condition. For details, see:

- https://nickdrozd.github.io/2021/01/14/halt-quasihalt-recur.html
- https://nickdrozd.github.io/2021/02/14/blanking-beavers.html

> I wrote up a one-size-fits-all simulator tonight

According to my philosophical considerations, it shouldn't be possible to create a truly fast TM simulator using a general while-loop. The TM program itself doesn't say to do a while-loop; it just says to do what it says to do. The while-loop is our way of making sense of the execution. And yet, you've used a while-loop and come up with a faster simulator!

$ time echo "1RB 1LC 1RC 1RB 1RD 0LE 1LA 1LD 1RH 0LA" | ./test/joetur
Finished in 47176870 steps

real 0m0.366s
user 0m0.365s
sys 0m0.002s

$ time echo "1LC1RC1RB1RD0LE1LA1LD1RH0LA" | ./pytur/machines/5-2.run
1 | 1RB 1LC 1RC 1RB 1RD 0LE 1LA 1LD 1RH 0LA | 47176868 47164580 47176869 47164582 47176870
done

real 0m0.522s
user 0m0.519s
sys 0m0.006s

Now, mine collects some extra data, and perhaps that slows it down some. Still, what do you suppose accounts for the difference? Three possibilities jump out at me:

1. The computed-goto operation is not as fast as I thought it would be.

2. Your simulator calls fgets in clumps, whereas mine reads in the program one getc call at a time.

3. In my simulator the instructions are collected into a big pile of separate ints, whereas yours has a program array. You're a more sophisticated C programmer than I am, so maybe that's the difference.

Or, it could be that my philosophy is wrong. Now we have a way to test the hypothesis.

From joe at begriffs.com Mon Mar 8 03:05:27 2021
From: joe at begriffs.com (Joe Nelson)
Date: Sun, 7 Mar 2021 21:05:27 -0600
Subject: Busy Beaver stuff
In-Reply-To:
References: <20210306012321.GH3869@begriffs.com> <20210306053609.GI3869@begriffs.com>
Message-ID: <20210308030527.GJ3869@begriffs.com>

Nicholas Drozd wrote:
> And yet, you've used a while-loop and come up with a faster simulator!

Huh, I'm actually surprised by that. I wasn't going for any special tricks in my implementation, just what seemed to be boring and clear.

> Now, mine collects some extra data, and perhaps that slows it down
> some. Still, what do you suppose accounts for the difference? Three
> possibilities jump out at me:

Maybe the data collection is part of it.
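For reference, what I mean by boring and clear is roughly the following shape -- a from-memory sketch with a hard-coded 2-state program, not the exact code in the repo, which also loads the program from input and collects a bit of output:

/* Table-driven sketch: the transition table is indexed by
 * [scanned symbol][current state], and one while loop grinds through
 * it.  The table hard-codes the 2-state champion "1RB 1LB 1LA 1RH",
 * so this should report 6 steps. */
#include <stdio.h>

#define TAPELEN 64
#define HALT    (-1)

struct action { int write, move, next; };   /* move: +1 = right, -1 = left */

int main(void)
{
	/* program[symbol][state]: states are A=0, B=1 */
	static const struct action program[2][2] = {
		{ {1, +1, 1}, {1, -1, 0} },      /* on 0:  A -> 1RB,  B -> 1LA */
		{ {1, -1, 1}, {1, +1, HALT} },   /* on 1:  A -> 1LB,  B -> 1RH */
	};

	static unsigned char tape[TAPELEN];
	int head = TAPELEN / 2, state = 0;
	unsigned long steps = 0;

	while (state != HALT)
	{
		struct action a = program[tape[head]][state];
		tape[head] = (unsigned char) a.write;
		head += a.move;
		state = a.next;
		++steps;
	}
	printf("Finished in %lu steps\n", steps);
	return 0;
}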
Also, my Makefile uses -O3 for aggressive compiler optimization. Is yours compiled that way?

> 1. The computed-goto operation is not as fast as I thought it would be.

I wonder if conditional gotos are unpredictable and inhibit instruction pipelining? https://en.wikipedia.org/wiki/Instruction_pipelining#Branches

In my code, the while(true) loop is very predictable, it's always the same chunk of instructions running. It uses the lookup table to see what state it's in, rather than going to a separate code area. Also the program[][] array is never modified after loading it, so perhaps the lookup program[tape[head]][state] ends up hitting cache most of the time. Anyone know of a good tool to measure cache misses?

> 2. Your simulator calls fgets in clumps, whereas mine reads in the
> program one getc call at a time.

I wonder how much of the execution time comes from the input gathering. My intuition is that with a 47-million-step Turing program, the getc won't be very significant, but I guess that would require we measure those sections of our code with a high-precision function like clock_gettime(CLOCK_REALTIME, ...) https://pubs.opengroup.org/onlinepubs/9699919799/functions/clock_getres.html

> 3. In my simulator the instructions are collected into a big pile of
> separate ints, whereas yours has a program array.

I don't know how that ends up working. Might be instructive to compare the generated assembly code. My program is only 200 lines in x86-64 assembly, as generated by gcc: https://godbolt.org/z/57r99K

From nicholasdrozd at gmail.com Mon Mar 8 17:08:02 2021
From: nicholasdrozd at gmail.com (Nicholas Drozd)
Date: Mon, 8 Mar 2021 11:08:02 -0600
Subject: Busy Beaver stuff
In-Reply-To: <20210308030527.GJ3869@begriffs.com>
References: <20210306012321.GH3869@begriffs.com> <20210306053609.GI3869@begriffs.com> <20210308030527.GJ3869@begriffs.com>
Message-ID:

I just reran it with the instrumentation ripped out, and things are right in the world again:

$ time echo "1LC1RC1RB1RD0LE1LA1LD1RH0LA" | ./pytur/machines/5-2.run
47176870
done

real 0m0.262s
user 0m0.258s
sys 0m0.006s

Still pretty close, but enough of an edge to justify my philosophical intuition for now. A real performance shootout would require a strict set of specifications. And as you mention, fancy pipeline and cache optimizations complicate things.

A comparison of the instructions used in the compiled assembly might lead one to believe (correctly) that these programs have different approaches to control flow:

Joe
    9 add
    1 and
    9 call
   12 cmp
    4 imul
    3 ja
    5 je
    7 jmp
    3 jne
   11 lea
   41 mov
    8 movsx
    6 movzx
   12 pop
    3 push
    5 ret
    2 sete
    2 setne
    3 sub
    2 test
   18 xor

Nick
   20 add
   31 call
   15 cmp
    6 je
   12 jmp
    9 lea
  149 mov
    9 movsx
    9 movzx
    6 push
    9 setne
   19 sub
    2 xor

From joe at begriffs.com Thu Mar 11 05:05:07 2021
From: joe at begriffs.com (Joe Nelson)
Date: Wed, 10 Mar 2021 23:05:07 -0600
Subject: Busy Beaver stuff
In-Reply-To:
References: <20210306012321.GH3869@begriffs.com> <20210306053609.GI3869@begriffs.com> <20210308030527.GJ3869@begriffs.com>
Message-ID: <20210311050507.GM3869@begriffs.com>

Nicholas Drozd wrote:
> I just reran it with the instrumentation ripped out, and things are
> right in the world again:
>
> $ time echo "1LC1RC1RB1RD0LE1LA1LD1RH0LA" | ./pytur/machines/5-2.run
> 47176870
> done
>
> real 0m0.262s
> user 0m0.258s
> sys 0m0.006s

OK, good to know the more direct approach is faster. Since your computer is our official benchmark, can you time this version?
https://github.com/begriffs/turing/blob/master/1RB1LC1RC1RB1RD0LE1LA1LD1RH0LA.c

I tried to remove anything that isn't related to just grinding through the tape. No loading, no stats, no bounds checking, no output. Might want to double check for typos in the code as well...

I was curious if anyone invented insane trickery to go even faster, and found a paper that uses probabilistic programming: https://core.ac.uk/download/pdf/82436988.pdf

Not sure if it's of interest merely in regard to an esoteric model of computation, or whether it's something we could implement on a "normal" computer. It has an appendix with pseudo code. Their claim is that if T is the number of steps a machine takes to halt, then "a probabilistic RAM can simulate a deterministic 1TTM in expected time O(T/ln ln T)."

A catch is that their approach sometimes goes slower than a straightforward simulation if you roll the dice wrong. They suggest running the classic and probabilistic versions in parallel and using the results of whichever finishes first, so that a bad worst-case time is prevented.

> And as you mention, fancy pipeline and cache optimizations complicate
> things.

I was probably full of crap about that, as your revised program shows. Was just grasping for an explanation. :)

From nicholasdrozd at gmail.com Wed Mar 17 03:20:10 2021
From: nicholasdrozd at gmail.com (Nicholas Drozd)
Date: Tue, 16 Mar 2021 22:20:10 -0500
Subject: Busy Beaver stuff
In-Reply-To: <20210311050507.GM3869@begriffs.com>
References: <20210306012321.GH3869@begriffs.com> <20210306053609.GI3869@begriffs.com> <20210308030527.GJ3869@begriffs.com> <20210311050507.GM3869@begriffs.com>
Message-ID:

> Since your computer is our official benchmark, can you time this
> version?

$ cc -O3 joefast.c -o joefast && time ./joefast

real 0m0.050s
user 0m0.049s
sys 0m0.001s

Of course, without any measurements or output, it's hard to verify that it actually did anything. Adding a simple count variable that is incremented at the top of every label shows that it does indeed do something:

$ cc -O3 joefast.c -o joefast && time ./joefast
Steps: 47176870

real 0m0.061s
user 0m0.057s
sys 0m0.004s

This is the understanding of Turing machines as specialized single-purpose appliances, as opposed to programmable general-purpose computers. Clearly a lot of speed can be gained by discarding all that unneeded functionality, just like race cars can gain speed by not needing to be able to turn right. (Okay, that's a myth, but you get the point!)

From chuck at cpjj.net Thu Mar 25 21:07:36 2021
From: chuck at cpjj.net (Chuck Jungmann)
Date: Thu, 25 Mar 2021 16:07:36 -0500
Subject: Projects on which I have been working.
Message-ID:

I have been working on a few ideas while developing a thesaurus app. I miss having in person meetings and discussions, so I'm presenting these ideas for your comments, if anyone is interested.

1. In response to comments on how my readargs library (https://github.com/cjungmann/readargs) would not compile on BSD, I have been experimenting with makefiles with Linux/BSD compatibility as a primary aim. I made a project, https://github.com/cjungmann/makefile_patterns , that includes several little makefile fragments that are meant to be included in a larger makefile in simple ways. I am attempting to only use portable features, making liberal use of the != operator for assigning the results of a shell command to a make variable.
I realize that in many cases, it would be much easier just to call a shell script, but I'm a masochist, and I'm looking for the limits of what's possible entirely in make.

2. In https://github.com/cjungmann/c_patterns, I'm playing with a different idea of a library. Rather than making a binary library, I am creating several source files that each accomplish a single task. There is a makefile fragment (developed in my makefile_patterns project) that, when included in another project, will download/clone the project and make links to requested modules that will then be compiled and linked along with the new project's sources.

3. Finally, https://github.com/cjungmann/th is the thesaurus project. It is a console program that is quicker to access and navigate than the online thesauri, which compensates somewhat for being less organized. This project incorporates the above approaches. I've tested it on Manjaro and FreeBSD where it successfully compiles and runs for me. The project started as a bit of an experiment, so there are several debugging options that you can ignore.

I have to run to the store for dinner ingredients, so you're all saved from my trademark rambling and never-ending email. I hope this email provokes some discussion.

Chuck Jungmann

From joe at begriffs.com Mon Mar 29 01:54:30 2021
From: joe at begriffs.com (Joe Nelson)
Date: Sun, 28 Mar 2021 20:54:30 -0500
Subject: Projects on which I have been working.
In-Reply-To:
References:
Message-ID: <20210329015430.GW3869@begriffs.com>

> I have been working on a few ideas while developing a thesaurus app.
> I miss having in person meetings and discussions, so I'm presenting
> these ideas for your comments, if anyone is interested.

Looking forward to having hack nights again!

I like how you're developing these experiments in connection with a project that applies them.

> 1. https://github.com/cjungmann/makefile_patterns ... makefile
> fragments that are meant to be included in a larger makefile in simple
> ways.

Can you tell me more about each pattern? Some of it is new to me.

== make_need_ld_so.mk ==

This is to check whether an intended destination for installing a shared library is appropriate on a system?

I've always kind of wondered about this -- do different Unices or even distros of Linux have different preferred places to install libraries? If so, I wonder if a project's makefile should be responsible only for building its own binaries/libraries, and then the project maintainer should create official packages per OS to do installation.

I know "./configure && make && sudo make install" is standard, but just wondering if that last command is better left for a package manager.

== make_db5.mk ==

This is fascinating, I've used SQLite as an embedded database, but I didn't know BSDs come with an API for a simple database! Looks like Linux provides it too in Glibc <2.2, and thereafter in libdb.

http://man.openbsd.org/dbopen.3
https://www.man7.org/linux/man-pages/man3/dbopen.3.html

Is Berkeley DB generally your go-to for storing data in your C programs?

== make_c_patterns.mk ==

This one seems like it would traditionally be part of some sort of configure script rather than having targets and dependencies in a makefile. Basically all the targets depend on c_patterns being cloned, and that's about it.

BTW, I think you can portably rely on substitutions like $(CP_NAMES:=.o) rather than shelling out to sed.
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html#tag_20_76_13_05

    Macro expansions using the forms $(string1 [: subst1 =[ subst2 ]]) or
    ${ string1 [: subst1 =[ subst2 ]]} can be used to replace all
    occurrences of subst1 with subst2 when the macro substitution is
    performed. The subst1 to be replaced shall be recognized when it is a
    suffix at the end of a word in string1 (where a word, in this context,
    is defined to be a string delimited by the beginning of the line, a
    <blank>, or a <newline>).

== make_static_readargs.mk ==

Perhaps this pattern is better solved with pkg-config? You could add a libreadargs.pc file that would be installed with readargs, and this file would describe how to link to the library. Pkg-config supports both static and dynamic linking, so programs wishing to link to readargs statically would pass `--static`. https://people.freedesktop.org/~dbn/pkg-config-guide.html

== make_confirm_libs.mk ==

This sounds like a job for pkg-config as well (checking whether libraries are installed). https://begriffs.com/posts/2020-08-31-portable-stable-software.html#build-with-pkg-config

Maybe I'm too enthusiastic about pkg-config. :) Let me know if I'm overlooking something.

> I am attempting to only use portable features, making liberal use of
> the != operator for assigning the results of a shell command to a make
> variable.

Interesting, I assumed != was GNU only, but looks like it's in BSD as well: https://man.openbsd.org/make#VARIABLE_ASSIGNMENTS

> I realize that in many cases, it would be much easier just to call a
> shell script, but I'm a masochist, and I'm looking for the limits of
> what's possible entirely in make.

The way I choose make vs shell: when I'm creating targets which rely on inputs, I use make. If I'm running commands to move/download files or test things then I use shell. Make's value to me is in knowing when to (re)build files based on dependencies.

> 2. In https://github.com/cjungmann/c_patterns, I'm playing with a
> different idea of a library. Rather than making a binary library, I
> am creating several source files that each accomplish a single task.

That's a cool idea, trying to find tasks that are compact enough to isolate to a single file.

== commaize.c ==

The POSIX extension to C adds a "'" flag character to printf (and snprintf etc) for this purpose.

https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html

    (The <quote>.) The integer portion of the result of a decimal
    conversion ( %i, %d, %u, %f, %F, %g, or %G ) shall be formatted with
    thousands' grouping characters. For other conversions the behavior is
    undefined. The non-monetary grouping character is used.

The non-monetary grouping character is locale-dependent (in the US we'd use a comma, but in Germany they'd use a dot). Thus you have to setlocale for the extended printf functionality to work.

For example, try this program:

#define _POSIX_C_SOURCE 200112L
#include <locale.h>
#include <stdio.h>

int main(void)
{
	setlocale(LC_NUMERIC, "");
	printf("%'d\n", 1024*1024*1024);
	return 0;
}

It outputs 1,073,741,824.

== get_keypress.c ==

It was interesting to see this file. I thought the only way to portably customize the terminal was with the Curses interface (ncurses etc). But looks like termios.h that you're using is POSIX. https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/termios.h.html (A rough sketch of the termios approach follows below.)

== init_struct_array.c ==

I don't really understand what this file is all about. Looks like some internal experiments?
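Back to get_keypress.c for a second: from memory the termios dance goes roughly like this (a sketch, not your code -- the helper name here is made up):

#include <stdio.h>
#include <termios.h>
#include <unistd.h>

/* read a single keypress without echo or line buffering,
 * restoring the original terminal settings afterward */
int get_one_keypress(void)
{
	struct termios saved, raw;

	if (tcgetattr(STDIN_FILENO, &saved) == -1)
		return -1;

	raw = saved;
	raw.c_lflag &= ~(ICANON | ECHO);  /* non-canonical input, no echo */
	raw.c_cc[VMIN] = 1;               /* block until one byte arrives */
	raw.c_cc[VTIME] = 0;
	tcsetattr(STDIN_FILENO, TCSAFLUSH, &raw);

	unsigned char c;
	ssize_t n = read(STDIN_FILENO, &c, 1);

	tcsetattr(STDIN_FILENO, TCSAFLUSH, &saved);
	return n == 1 ? c : -1;
}

int main(void)
{
	printf("press a key: ");
	fflush(stdout);
	int c = get_one_keypress();
	printf("\ngot byte %d\n", c);
	return 0;
}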
> There is a makefile fragment (developed in my makefile_patterns
> project) that, when included in another project, will download/clone
> the project and make links to requested modules that will then be
> compiled and linked along with the new project's sources.

What's the motivation for copying files into a project and then compiling/linking, vs putting the object files into a static library and linking your program with that library? IIRC, the linker adds only referenced symbols to the final executable, so binary bloat doesn't appear to be a concern in either approach.

> 3. Finally, https://github.com/cjungmann/th is the thesaurus project.
> It is a console program that is quicker to access and navigate than
> the online thesauri, which compensates somewhat for being less
> organized.

Looks cool from what I can see in the readme!

> I've tested it on Manjaro and FreeBSD where it successfully compiles
> and runs for me.

Just tried building on Mac and ran into a problem (on commit 4644848d):

$ make
cc -o th -ldb
ld: library not found for -ldb
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [th] Error 1

I added the dependency with macports:

sudo port install db62

It still didn't build, so I added this to the Makefile:

LDFLAGS = -L/opt/local/lib/db62

However it's still not working:

$ make
cc -o th -L/opt/local/lib/db62 -ldb
Undefined symbols for architecture x86_64:
  "_main", referenced from:
     implicit entry/start for main executable
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [th] Error 1

From nicholasdrozd at gmail.com Mon Mar 29 15:17:08 2021
From: nicholasdrozd at gmail.com (Nicholas Drozd)
Date: Mon, 29 Mar 2021 10:17:08 -0500
Subject: Busy Beaver stuff
In-Reply-To: <20210311050507.GM3869@begriffs.com>
References: <20210306012321.GH3869@begriffs.com> <20210306053609.GI3869@begriffs.com> <20210308030527.GJ3869@begriffs.com> <20210311050507.GM3869@begriffs.com>
Message-ID:

Here's a fun exercise: try to rewrite Busy Beaver programs using standard structured programming operators. The following code is equivalent to the BB5 program posted earlier, but without using a single goto statement:

// -------------------------------------------------------------------
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>

#define TAPELEN (1L << 20)

bool tape[TAPELEN];                  /* tape cells, all blank to start */
bool *head = tape + (TAPELEN/2);     /* head begins in the middle */
unsigned int STEPS = 0;

#define BLANK !*head
#define GO_RIGHT { ++STEPS; head++; }
#define GO_LEFT { ++STEPS; head--; }
#define PRINT *head = true;
#define ERASE *head = false;
#define PRINT_AND_GO_RIGHT PRINT; GO_RIGHT;
#define PRINT_AND_GO_LEFT PRINT; GO_LEFT;
#define ERASE_AND_GO_LEFT ERASE; GO_LEFT;

int main(void)
{
    while (1) {
        if (!BLANK) {
            GO_LEFT;
        } else {
            PRINT_AND_GO_RIGHT;
            while (!BLANK)
                GO_RIGHT;
            PRINT_AND_GO_RIGHT;
        }

        if (BLANK) {
            PRINT_AND_GO_RIGHT;
            while (!BLANK)
                GO_LEFT;
            PRINT_AND_GO_LEFT;
            continue;
        }

        ERASE_AND_GO_LEFT;

        if (!BLANK) {
            ERASE_AND_GO_LEFT;
            continue;
        }

        PRINT_AND_GO_RIGHT;

        printf("%u\n", STEPS);
        return EXIT_SUCCESS;
    }
}
// -------------------------------------------------------------------

After running that, you should see the magic number 47176870. Admittedly it's still hard to see how the program "works" or what it "does", but at least the control flow is transparent.

From chuck at cpjj.net Wed Mar 31 22:35:52 2021
From: chuck at cpjj.net (Chuck Jungmann)
Date: Wed, 31 Mar 2021 17:35:52 -0500
Subject: Projects on which I have been working.
In-Reply-To: <20210329015430.GW3869@begriffs.com>
References: <20210329015430.GW3869@begriffs.com>
Message-ID:

I had hoped to start a discussion about how everyone keeps track of their work. That's how I started with all the xxx_patterns repositories. I can't count how many times I encounter coding challenges that I know I solved once-upon-a-time but can't remember how I did it, and can't find the old code. I have tried different things over the years, but my latest idea involves my repositories with the "_patterns" suffix, where I have working code rather than the incomplete explanations for things I barely understand that I used to write. Being on GitHub means I can still find it despite having moved on to a new computer.

I'm curious how others handle the problem of finding or keeping track of old solutions.

*makefile_patterns*

Directly addressing the makefile fragments, they are more experimental than my other things. I am very inexperienced with makefiles, so I'm learning the ropes. I took it as a challenge when Joe and JuneBug noted that my library wouldn't compile on BSD, so achieving that portability is a big part of my efforts here. I'm learning that some of my ideas for resolving problems are a bit misguided. I think that my idea of terminating make after a discovered problem may have some merit. I welcome any comments.

That said, I'll comment on the makefile scripts. All the makefile fragments begin with "make" in order that they sort together in a directory listing. However, as my make scripts proliferate, I have concluded that it's better to segregate them in a make.d directory, and in there the "make_" prefix may be unnecessary.

*make_need_ld_so.mk*

make_need_ld_so.mk attempts to address the conflict I found on Manjaro. The recommended directory for user-installed libraries is /usr/local/lib, but ldconfig can't find it after a reboot unless the directory is found in ld.so.conf, and that directory is not found in ld.so.conf on Manjaro. There is a further problem, considering BSD portability, that "ldconfig -p" only works on Linux, so I don't know how to detect if a library is invisible. I'm punting on this one after a StackOverflow conversation where someone commented that most Linux users would not want a "make install" to mess with their configuration for a library. I see his point, but I'm unsure how to help an unsophisticated Linux/BSD user to solve an installation problem.

*make_bd5.mk*

My thesaurus project uses the Berkeley Database (BDB), which seems to be installed by default on most Linux systems; version 5.3.28 is installed with git, so even the later version is generally present on BSD. It's used by SQLite, so it's performant enough. I'm experimenting with it because I used a similar database engine (FairCom) back in the 90s. It requires planning to get good performance by preparing appropriate indexing, but potentially has much better performance as a result of that planning and eliminating the query-parsing step required by SQL queries.

My FreeBSD installation did have the Berkeley Database, but it was a much older version that doesn't include BTree tables. I want to use some features in version 5.3.28, so make_bd5.mk tries to detect an inappropriate BDB version and suggest that the user install the new version.

I am playing with some object-oriented C patterns, where I'm trying to abstract the database interface to easily switch between BDB and LMDB (the Lightning Memory-Mapped Database), which is a similar product.
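Roughly, the abstraction I have in mind is a struct of function pointers, something like the sketch below. The names are invented for this example and the in-memory backend is only a stand-in; real backends would wrap dbopen() or mdb_env_open() behind the same interface.

#include <stdio.h>
#include <string.h>

typedef struct kv_store kv_store;

struct kv_store {
	int  (*open) (kv_store *self, const char *path);
	int  (*put)  (kv_store *self, const char *key, const char *val);
	int  (*get)  (kv_store *self, const char *key, char *val, size_t vallen);
	void (*close)(kv_store *self);
	void *impl;                 /* DB* for BDB, MDB_env* for LMDB, ... */
};

/* toy in-memory backend holding a single pair, just to make the sketch runnable */
static char mem_key[64], mem_val[256];

static int mem_open(kv_store *s, const char *path)
{ (void)s; (void)path; return 0; }

static int mem_put(kv_store *s, const char *key, const char *val)
{
	(void)s;
	snprintf(mem_key, sizeof mem_key, "%s", key);
	snprintf(mem_val, sizeof mem_val, "%s", val);
	return 0;
}

static int mem_get(kv_store *s, const char *key, char *val, size_t vallen)
{
	(void)s;
	if (strcmp(key, mem_key) != 0)
		return -1;
	snprintf(val, vallen, "%s", mem_val);
	return 0;
}

static void mem_close(kv_store *s) { (void)s; }

static kv_store memory_store = { mem_open, mem_put, mem_get, mem_close, NULL };

int main(void)
{
	kv_store *db = &memory_store;   /* could just as well point at a bdb_store */
	char buf[256];

	db->open(db, "ignored");
	db->put(db, "lexicon", "thesaurus");
	if (db->get(db, "lexicon", buf, sizeof buf) == 0)
		printf("lexicon -> %s\n", buf);
	db->close(db);
	return 0;
}

The rest of the program would only ever see the kv_store interface, so switching from BDB to LMDB would mean writing one new table of functions rather than touching every call site.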
I'm not committed to BDB, but it is one of the NoSQL databases, which gives it some currency.

*make_c_patterns.mk*

*make_c_patterns.mk* is what I think is my most interesting idea, though I'd like the opinion of people in the group. Rather than losing track of solutions that I may want later, I decided to do some of the development of my thesaurus project in my c_patterns project, with the source files that solve reusable problems being reintegrated into the thesaurus project through the *make_c_patterns.mk* makefile script. It helps me to focus on isolating each solution and making the code clear and robust so I won't be tempted to rewrite it in the future. If you look at the preface of the *make_c_patterns.mk* script, you can see from the example how it can include only the necessary subset of the collection for building a given project, even as the collection of patterns grows.

*make_static_readargs.mk*

You may remember me mentioning the readargs shared library in a previous posting to the list. As a consequence of the above-mentioned SO conversation and the difficulty in confirming access to the library, I decided it would be better to use a statically-linked version of readargs for my thesaurus project.

Following on the c_patterns example, I thought it would be easier for the end user to simply clone and build the readargs project in a subdirectory of the thesaurus directory, and link the static library to the thesaurus executable. That makes it automatic for the user, no longer requiring the user to install the library, and avoids problems finding the library. I can easily change the makefile fragment if, haha, readargs becomes popular enough that it might be shared among multiple programs.

*make_confirm_libs.mk*

*make_confirm_libs.mk* is an early experiment with warning a user of missing libraries. I've learned some things since I made it, and I wouldn't start something like that going forward. However, in the interest of saving my ideas in case I need them later, I am leaving it in the makefile_patterns project. It's not even a very good solution, given that a user may have put a library in their /home directory; make_confirm_libs.mk only looks under /usr.

The first problem with pkg-config is that I didn't know about it. I'm still learning about the plumbing of the Linux and BSD environments. The second possible issue is that I wouldn't expect it to detect libraries installed through "make install". Maybe I should make my projects create a package to be installed so pkg-config keeps a record of it. However, I worry about how different distributions use different package managers. That seems to be an unwelcome escalation of portability considerations.

Joe, I appreciate you mentioning your article about pkg-config. I'll read it when I'm done with this response. I didn't want you to wait any longer for a response to your thoughtful post.

A final comment about the makefile_patterns is that I'm trying to see how much I can do in the makefile environment. I don't pretend they are best practices. I realize that writing and calling a script is a better solution than the contortions I use to detect things in the makefile fragments. Writing a config script is also better, especially for saving command-line options to config in a bespoke makefile that will enable uninstalls for unconventional installation locations. I pursue these strange solutions because I have a peculiar inclination to poke at the edges of what I can do with a tool.
I'll try to comment in a later post on your questions about the c_patterns source files. I appreciate all the time you spent reviewing and commenting on my code.