--- user-visible fixes

* The random number generator's behavior depends on the size of 'unsigned'. It ought to be portable, to make tests reproducible.
* Make sure all API functions follow abort vs. exception policy.
* Get rid of mn_make_multi_valued_procedure and add a flag to mn_make_procedure: just have one universal constructor function, and then have whatever common-case functions we need (mn_make_multi_valued_procedure is not a common-case function...)
* Perhaps all Minor I/O functions, other than byte and byte vector operations, should be wide-oriented operations. It's odd that we can't use mn_write on a port, and then use mn_exception_string and mn_display_str to print an error message.
* Should mn_get_utf8 and mn_put_utf8 return a count of code units consumed, instead of a new pointer? Should they take a pointer by reference? (GCC's manual says it will never place a variable in a register if its address is taken.) Try changing the callers first, and see how they look.
* Should the arguments to mn_push be reversed? They should be the same as in CL. The list argument should be passed by reference, to more closely resemble CL push. If we're going to name it that.

--- utter trivia

* Rename globalrefs.[ch] to global-refs.[ch]
* Set up process to do nightly check out / make dist / unpack / build { in source tree, in separate tree }

--- consistency checks

* Why use 'assert' when we have 'check'? Also check for 'if (...) abort'. Just print error message with source location.

--- others

* Use ISO C 'inline', instead of GCC hair.
* numeric conversion functions for size_t, intmax_t, and uintmax_t.
* tests for unicode-case.c
* Implement better Unicode case insensitivity in reader.
* Don't store a terminating null character in strings. More sophisticated representations (shared, quick-concat) won't allow that, and since the strings aren't in the C execution character set anyway, we can't do the trick of passing them directly to system calls.
* Abstract out indexing in hash-tables.c.
* Strictly-typed mn__tag_symbol, mn__tag_vector.
* mn_from_char and mn_from_wchar aren't total, but there's no predicate the user can call to make sure they'll succeed. Granted, Unicode ought to be able to handle whatever's there, but if the C execution character set contains undefined code points, then those can't be converted to Unicode.
* The hash table should be its own type, and the symbol table just one application of it.
* Hash tables should not rely on incoherent sections for their mutual exclusion.

==== prd.txt item 6 ("implement Core Scheme") finished

==== prd.txt item 7 ("Macro expander") finished

* Provide inlined, system-specific definitions for the mn__begin_coherent and mn__end_coherent functions, and see if that speeds things up.
* Eliminate uses of mn__per_thread outside pause-posix-tls.c, and maybe trace.c.
* Use cancellation, not a thread-specific value destructor.
* Exceptions should be structures as in MzScheme, not strings.

==== prd.txt item 8 ("Full R5RS, with modules and macros") finished

* Don't export mn__ functions from libminor.so
* Move all forms of the basic execution character set to test-lib.c, along with a 'char_name' function; use in c-api-ports, c-api-characters, and c-api-strings.
* New test stress-pair.c: build random trees, trade pieces between threads, then replay the whole process and check that the results are as we expected. To make inter-thread trades reproducible, have each thread record the sequence of other threads it traded with, and then in the replay wait only for that thread.
* For tests that don't actually communicate between threads, add a "torture test" that runs them in many threads over and over. Then let the regression tests run single-threaded (and faster!) by default.
* Need tests for strings with embedded null characters.
* generate per-type test functions in c-api-numbers.c with a shell script
* Use linked lists in gc/tests/disjoint-types.c.
* Are new ref groups too large?
  At the moment, we have chosen the size of a reference clump to be small enough that we don't mind allocating one for every call, but large enough that we still reduce our allocation overhead. But those two needs are clearly in tension, and there's no need for them to be.

  I'll bet small ref groups will be more common than large ones. So we could have variable clump sizes --- start out with a small clump, with room for thirty refs or so, and then each new clump we allocate for the ref group would be double the size of the previous clump. That way, the number of clumps would be logarithmic in the number of refs ("The log of n, where n is the size of something, is effectively a constant."), but ref groups with few refs would still be small.
* Use fields in call structures in preference to thread-local variable references. TLS references could entail function calls.
* Weak references
* eqv object sets? Like hash tables, but you can only associate boolean values? Internally, they can be implemented with half the space overhead, and there's no way to do as well (that I can see) without primitives.
* Weak eqv hash tables.
* The symbol hash should be weak. (Remember that deleting entries from a table that rehashes collisions requires care.)
* Guardians
* GC should avoid copying large objects.
* Once GC avoids copying large objects, use malloc / realloc to manage eqv hash tables; see comments in hash.c.
* There should be immutable pairs. Yes, this means you can't use a constant displacement in an addressing mode to access cars and cdrs; you'll really have to mask things off. Get over it.
* Labels for built-in types and symbols constructed at initialization should be statically allocated and initialized, and placed in the immortal generation.
* Mirage GC map nodes should be statically allocated and initialized.
* Should we do a better job at choosing initial block sizes?
* Can we get better locality by doing depth-first traversal?
* heap dumper, for debugging GC problems
* When we start generating object files containing heap objects, we'll want to fix the values in the tag enums, since they'll be a matter of public interface.
* The labels for all the labelled types should be statically allocated and initialized, and placed in the immortal generation.
* There should be a "debugging" mode, where freeing a reference marks it as garbage, but the storage itself is never reused, and every API function checks to see if it has been passed a dead reference. Similarly for calls. Ideally, one could switch this on and off without recompiling.
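
  A rough sketch of what the "debugging" mode in the last item might look like. Every name here (sketch_ref, sketch_debug_refs, and so on) is made up for illustration and is not part of the Minor API; real references surely carry more state than this. The switch is an ordinary run-time variable, so it can be flipped without recompiling. Storage freed while the mode is on is deliberately leaked, which is the point: a stale pointer can never alias a new reference.

```c
#include <stdio.h>
#include <stdlib.h>

/* All names below are hypothetical; this is not Minor's real API. */

typedef struct sketch_ref {
  int dead;   /* set when the reference is freed in debugging mode */
  int value;  /* stand-in for whatever the reference designates */
} sketch_ref;

/* Run-time switch: non-zero enables dead-reference checking. */
static int sketch_debug_refs = 0;

static sketch_ref *sketch_make_ref(int value)
{
  sketch_ref *r = malloc(sizeof *r);
  if (!r)
    abort();
  r->dead = 0;
  r->value = value;
  return r;
}

static void sketch_free_ref(sketch_ref *r)
{
  if (sketch_debug_refs)
    r->dead = 1;  /* mark as garbage; the storage is never reused */
  else
    free(r);      /* normal mode: release the storage as usual */
}

/* The check every API function would make on its reference arguments.
   Returns 0 if the reference is usable, -1 (after reporting the
   caller's source location) if it is dead. */
static int sketch_check_ref(const sketch_ref *r, const char *file, int line)
{
  if (sketch_debug_refs && r->dead) {
    fprintf(stderr, "%s:%d: dead reference passed to API function\n",
            file, line);
    return -1;
  }
  return 0;
}

/* An example API function that validates its argument first. */
static int sketch_ref_value(const sketch_ref *r, int *out)
{
  if (sketch_check_ref(r, __FILE__, __LINE__) != 0)
    return -1;
  *out = r->value;
  return 0;
}
```

  With sketch_debug_refs set, calling sketch_ref_value on a freed reference reports an error instead of reading reused storage. With it clear, the check costs one test of a global per call. The same pattern would presumably apply to calls.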