author: Aryadev Chavali <aryadev@aryadevchavali.com> 2025-05-28 23:54:34 +0100
committer: Aryadev Chavali <aryadev@aryadevchavali.com> 2025-05-28 23:54:34 +0100
commit: bfff660d0e1a03063b889f84e8cfd046565c6046 (patch)
tree: 925ef120d7a20bafaed34fcf87dc5bb150ea3e3d
parent: 12de1e8db90bccd5a0eefd21075f07c7b7e3dfaa (diff)
Rename tasks.org to oats.org, restructure
-rw-r--r-- oats.org (renamed from tasks.org) 101
1 files changed, 39 insertions, 62 deletions
diff --git a/tasks.org b/oats.org
index 179e1a2..dbad845 100644
--- a/tasks.org
+++ b/oats.org
@@ -1,7 +1,20 @@
-#+title: Tasks
-#+date: 2025-02-18
-
-* WIP Implement a reader
+#+title: Oats tracker
+#+author: Aryadev Chavali
+#+description: A general tracker for work being done on the project
+#+FILETAGS: :oats:
+
+* Issues :issues:
+** TODO Fix issue with memcpy overlap when string concatenating
+[[file:lisp/lisp.c::// FIXME: Something is going wrong here!]]
+
+Ideas on what's going wrong:
+- String sizes seem off
+- Maybe something is wrong with arena allocator; we use
+ [[file:lib/sv.c::newsv.data = arena_realloc(allocator, sv.data,
+ sv.size, newsv.size);][arena_realloc]] which seems to be the root of
+ the memcpy-overlap
+* Features :features:
+** WIP Reader :reader:
We want something a bit generic: able to handle reading from some
buffer of memory (a string, or contents of a file where we can read
the entire thing at once) or directly from a file stream (STDIN,
@@ -15,19 +28,7 @@ We also want to be able to admit when reading went wrong for some
reason with proper errors messages (i.e. can be read by Emacs) - this
will need to be refactored when we introduce errors within the Lisp
runtime itself.
-** TODO Implement floats and rationals
-Rationals are pretty easy - just two integers (quotient and divisor) -
-so a tagged cons cell would do the job. Floats are a bit more
-difficult since I'd either need to box them or find a creative way of
-sticking IEEE-754 floats into < 64 bits.
-
-Also implement a reader macro for #e<scientific form>. Also deal with
-[-,+,]inf(.0) and [-,+,]nan(.0).
-
-Need to do some reading.
-
-[[file:r7rs-tests.scm::test #t (real? #e1e10)][trigger]]
-** TODO Consider user instantiated reader macros
+*** TODO Consider user instantiated reader macros
We don't have an evaluator so we can't really interpret whatever a
user wants for a reader macro currently, but it would be useful to
think about it now. Currently I have a single function which deals
@@ -39,47 +40,23 @@ consider user environments via the context.
[[file:reader.c::perr_t parse_reader_macro(context_t *ctx, input_t
*inp, lisp_t **ret)][function link]]
-* TODO Consider Lisp runtime errors
-* TODO Admit arbitrarily sized integers
-Currently we admit fixed size integers of 63 bits. They use 2s
-complement due to x86 which means our max and min are 62 bit based.
-
-However, to even try to be a scheme implementation we need to allow
-arbitrarily sized integers. What are the specific tasks we need to
-complete in our model to achieve this?:
-+ Allow "reading" of unfixed size integers
- + This will require reading a sequence of base 10 digits without
- relying on strtold
-+ Implement unfixed size integers into our memory model
- + Certainly, unfixed size integers cannot be carried around like
- fixnums wherein we can embed an integer into the pointer.
- Thus we have to allocate them in memory.
- + NOTE: There will be definitely be an optimisation to be done
- here; integers that are within the bound of a fixnum could be
- left as a fixnum then "elevated" to an integer when needed
- + I think the big idea is allocating them as a fixed set of bytes
- like big symbols. For big integers we have to read the memory
- associated thus we need a pointer. Due to 2s complement it should
- be trivial to increase the size of an integer to fit a new result
- i.e. if I'm adding two integers and that leads to an "overflow"
- where the result is of greater width than its inputs, we should
- just allocate new memory for it.
-
-Consequences:
-- Greater memory use
- - In fact exponential if we need to allocate a whole new integer per
- operation rather than utilising the input memory
-- Possible loss of performance due to making integers over fixnums
- when they don't need to be
-- Comparison is harder on integers
-- Harder to cache for the CPU
-
-but all of this is to be expected when the user is an idiot.
-* TODO Think about how to perform operations on different types
-** TODO Integers
-** TODO Symbols
-** TODO Pairs
-* DONE More efficient memory model for symbols
+*** TODO Parse exponential notation
+We're erroring out here due to not having proper reader notation
+[[file:examples/r7rs-tests.scm::test #t (real? #e1e10)]]
+** TODO Evaluator
+** TODO Runtime errors
+** TODO Better numerics
+We currently admit fixed size integers (63 bits). We _need_ more to
+be a scheme.
+*** Unfixed size integers
+*** Rationals
+*** Floats
+*** Complex numbers
+** TODO Primitive operations
+** TODO Macros
+** TODO Modules
+* Completed :completed:
+** DONE More efficient memory model for symbols
The primitive model for symbol allocation is an 8 byte number
representing the size of the symbol, followed by a variable number of
characters (as bytes). This is stored somewhere in the memory
@@ -123,7 +100,7 @@ need to allocate memory. But, in the worst case of 8 character
symbols, we're only allocating two 64 bit integers: these are easy to
walk on x86 and we've reached at least parity between the memory
required for administration (the size number) and the actual data.
-** Being more aggressive?
+*** Being more aggressive?
Technically, ANSI bytes only need 7 bits. For each of the 7 bytes
used for the character data, we can take one bit off, leaving us with
7 bits to use for an additional character. We don't need to adjust
@@ -148,10 +125,10 @@ to do a lot more work. x86-64 CPUs are much better at walking bytes
than they are walking 7 bit offsets. This may be something to
consider if CPU time is cheaper than allocating 8 byte symbols
somewhere.
-* DONE Tagging scheme based on arena pages
+** DONE Tagging scheme based on arena pages
2025-04-09:21:59:29: We went for option (2) of just taking a byte for
free from the memory address and using it as our management byte.
-** 1) Page-offset schema
+*** 1) Page-offset schema
I've realised arenas are way better than the standard array dynamic I
was going for before. However, we lose the nicer semantics of using
an array index for pointers, where we can implement our own semantics
@@ -213,7 +190,7 @@ will be stable regardless of any further memory management functions
performed on the arena (excluding cleanup) - so once you have a host
pointer, you can use it as much as you want without having to worry
about the pointer becoming invalid in the next second.
-** 2) 48-bit addressing exploit
+*** 2) 48-bit addressing exploit
Most x86 CPUs only use around 48-56 bits for actual memory addresses -
mostly as a result of not needing _nearly_ as many addresses as a full
64 bit word would provide.
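
Note (not part of the patch): the "48-bit addressing exploit" and the "management byte" mentioned in the hunks above can be illustrated with a minimal C sketch. It assumes user-space addresses leave the top byte of a 64-bit word unused, which holds on typical x86-64 systems but is not guaranteed everywhere. The names `tag_ptr`, `get_tag`, `untag_ptr`, `TAG_SHIFT` and `ADDR_MASK` are hypothetical and are not taken from the oats source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: top 8 bits hold a tag, low 56 bits hold the address. */
#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Pack a pointer and a one-byte tag into a single 64-bit word. */
static uint64_t tag_ptr(void *ptr, uint8_t tag)
{
  uint64_t bits = (uint64_t)(uintptr_t)ptr;
  /* Assumption: the address fits in 56 bits, so the top byte is free. */
  assert((bits & ~ADDR_MASK) == 0);
  return ((uint64_t)tag << TAG_SHIFT) | bits;
}

/* Read the tag back out of the top byte. */
static uint8_t get_tag(uint64_t word)
{
  return (uint8_t)(word >> TAG_SHIFT);
}

/* Strip the tag and recover the original host pointer. */
static void *untag_ptr(uint64_t word)
{
  return (void *)(uintptr_t)(word & ADDR_MASK);
}
```

The tag has to be stripped with `untag_ptr` before the word is dereferenced; the assertion in `tag_ptr` is there to catch platforms where the free-top-byte assumption fails, rather than silently corrupting an address.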