#code.20080309.log

--- Log opened Sun Mar 09 00:00:04 2008
00:23		GeekSoldier|bed [~Rob@91.18.86.ns-26604] has quit [Ping Timeout]
00:24		GeekSoldier|bed [~Rob@91.18.86.ns-26604] has joined #code
00:29		GeekSoldier|bed [~Rob@91.18.86.ns-26604] has quit [Ping Timeout]
00:29		AnnoDomini [AnnoDomini@83.21.32.ns-4025] has quit [Quit: (...) By this point, the astute reader has picked up that Nethack isn't a "game" as much as an extremely prolonged and extremely elaborate form of masochism. Ask any serious player.]
01:30		Vornicus [~vorn@Admin.Nightstar.Net] has quit [Ping Timeout]
01:31		Vornotron [~vorn@Admin.Nightstar.Net] has joined #code
01:31		You're now known as TheWatcher
04:50		You're now known as TheWatcher[zZzZ]
04:53		Vornotron is now known as Vornicus
04:54		Vornicus is now known as NSGuest-5480
04:55		NSGuest-5480 is now known as Vornicus
05:00		* Reiver finally clicks as to what the hell a tuple really is.
05:00		* Reiver can't believe he'd struggled with the concept, given it is distinctly 'Durrr' stuff. >.<
05:12	< Vornicus>	Heh
05:16		Reiver is now known as ReivShoppin
05:16	<@ReivShoppin>	Seriously!
05:17	<@ReivShoppin>	"A row of data"
05:17	<@ReivShoppin>	It'd had me puzzled in Python for ages >.>
05:27	< Vornicus>	snrk
05:34		Thaqui [~Thaqui@Nightstar-123.jetstream.xtra.co.nz] has joined #code
05:34		mode/#code [+o Thaqui] by ChanServ
06:38		ReivShoppin is now known as Reiver
07:00		GeekSoldier|bed [~Rob@Nightstar-8762.dip.t-dialin.net] has joined #code
07:03		Vornicus [~vorn@ServicesOp.Nightstar.Net] has quit [Ping Timeout]
07:04		GeekSoldier|bed is now known as GeekSoldier
07:07		Vornicus [~vorn@Admin.Nightstar.Net] has joined #code
07:07		mode/#code [+o Vornicus] by ChanServ
07:34		AnnoDomini [AnnoDomini@83.21.32.ns-4025] has joined #Code
07:34		mode/#code [+o AnnoDomini] by ChanServ
07:36		Vornicus is now known as Vornicus-Latens
08:05	<@jerith>	Reiver: I always thought of it as "a read-only list".
08:05	<@Reiver>	jerith: Yeah, well, I'd been trying to get my head around the concept.
08:05	<@Reiver>	Now I do, problem solved~
08:07	<@jerith>	:-)
09:00		GeekSoldier [~Rob@Nightstar-8762.dip.t-dialin.net] has quit [Ping Timeout]
09:09		Thaqui [~Thaqui@Nightstar-123.jetstream.xtra.co.nz] has left #code [Leaving]
09:41		GeekSoldier [~Rob@Nightstar-9089.dip.t-dialin.net] has joined #code
09:43		gnolam [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has joined #Code
09:43		mode/#code [+o gnolam] by ChanServ
09:54		Brother_Willibald [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has joined #Code
09:55		Brother_Willibald [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has quit [Quit: poof]
10:20		You're now known as TheWatcher
11:30		AnnoDomini [AnnoDomini@83.21.32.ns-4025] has quit [Ping Timeout]
11:31		AnnoDomini [AnnoDomini@83.21.28.ns-26444] has joined #Code
11:31		mode/#code [+o AnnoDomini] by ChanServ
12:29		eXeLaNCe [~dddd@88.245.15.ns-13237] has joined #code
13:03		eXeLaNCe [~dddd@88.245.15.ns-13237] has quit [Quit: ]
13:08	< Moltare>	Idly, lads, I'm looking for the regex that means "Anything, including spaces, tabs and newlines, that comes between /* and */"
13:08	< Moltare>	I thought it was "/*"[. \t\n]*"*/" , but that doesn't seem to cut the mustard
13:08	<@McMartin>	It doesn't because it's allowing */s in the middle of it.
13:09	<@McMartin>	That said
13:09	<@McMartin>	I seem to recall that Tiger allows nested comments, so you're going to have to be more cunning about this
13:09	< Moltare>	Tiger doesn't
13:09	< Moltare>	Or, wate
13:09	< Moltare>	Tiger does, but I'm not trying to build a lexer for Tiger
13:10	< Moltare>	It's for a dodgy homebrew that our lecturer made up
13:13	<@McMartin>	Ah, OK
13:13	<@McMartin>	The problem you've hit is that [. \t\n]* means "as many of any character you can munch"
13:13	<@McMartin>	That's the entire file
13:13	<@McMartin>	You need to exclude the "*/" sequence from that middle bit.
13:14	< Moltare>	Well, my /current/ problem is that it doesn't find a "/*" at all; it finds a division operator followed by a multiplication operator
13:14	<@McMartin>	Aha.
13:14	< Moltare>	Despite the quotes
13:14	<@McMartin>	You need to put the comment-matcher higher up in the flex file so it will have higher priority.
13:15	< Moltare>	It was already above the others
13:15		* Moltare puts it at the top instead
13:15	<@McMartin>	Hrm.
13:15	<@McMartin>	Maybe it then needs to be at the bottom~
13:22		* Moltare fiddle
13:24	< Moltare>	So, using "/*"[a-zA-Z0-9 \t\n]*"*/" to limit it to alphanumeric characters for the moment
13:25	< Moltare>	It recognises /* */ as a comment, but not /* comment */
13:26		* GeekSoldier tries to remember... "/blah/s"?
13:28	<@McMartin>	GeekSoldier: This is flex, not Perl
13:28	<@McMartin>	Moltare: OK, that boggles me
13:29	< GeekSoldier>	oh.
13:29		* GeekSoldier returns to his corner.
13:30	<@McMartin>	Maybe it needs spaces between the "s and the []s?
13:32	< Moltare>	no appreciable difference
13:34	<@McMartin>	Does the space need to be escaped?
13:34	< Moltare>	It doesn't for the "ignore spaces, tabs and newlines" entry
13:34	<@McMartin>	Blarghlecopter.
13:35	<@McMartin>	What does it think of /*c*/?
13:36	<@McMartin>	I'm wondering if it's somehow getting boggled by more than one character or something
13:37	< Moltare>	Breaks in the exact same way
13:37	< Moltare>	/**/ and /* */ are fine, /*c*/ and /* c */ are not
13:37	<@McMartin>	How about /* */?
13:38	< Moltare>	Fine
13:39	<@McMartin>	/* 123 */?
13:39	< Moltare>	Not
13:40	<@McMartin>	This is several varieties of aggravating
13:40	<@McMartin>	The TeXInfo implies that it should work
13:40	<@McMartin>	What happens if you remove the 0-9 and try your test cases?
13:41	< Moltare>	Same
13:41	<@McMartin>	In case this is some bizarre heinousness where - works for letters but not numbers
13:41	<@McMartin>	Rargh.
13:41	<@McMartin>	OK
13:41		* Moltare ponders, deletes everything but the .l file, recompiles from scratch just in case
13:43	< Moltare>	Ah, now it won't compile. This indicates progress of a backwards sort
13:43	<@McMartin>	Gnrk.
13:43	<@McMartin>	What's the error?
13:44	< Moltare>	FXD
13:44	< Moltare>	Spaces where they shouldn't be
13:45	< Moltare>	/**/: works /* */: works /*c*/: works /* c */: works
13:45	<@McMartin>	... OK.
13:45	<@McMartin>	And now /* 123 */ won't.
13:45	< Moltare>	And the compiler was reusing an old version of something
13:46	< Moltare>	I put that back in, McM
13:46	<@McMartin>	Aha.
13:46	<@McMartin>	Good times then
13:47	< Moltare>	Now I need to change it from "alphanumeric characters" to "anything that isn't */", I take it
13:48	<@McMartin>	Right.
13:48	<@ToxicFrog>	Yes.
13:48	<@McMartin>	But, of course, /********/ needs to be legal.
13:48	<@McMartin>	So you can't just do [^*]*
13:50	< Moltare>	What about [^"*/"]*? Or will that just compare every character to */ and therefore never fire?
13:51	<@ToxicFrog>	That is "everything but ", *, and /", except that since flex uses " as well it probably won't work period
13:51	< Moltare>	ah
13:52	<@ToxicFrog>	I'm not entirely sure it's possible to handle /* comments */ using just regexes
13:52	<@McMartin>	It is.
13:52	<@McMartin>	You have to be a rat bastard about it, but it's doable.
13:52	<@McMartin>	What isn't doable with raw regex is nested comments.
13:53	<@McMartin>	Flex can do it by abusing state variables to make it limited-context-free, but.
13:53	< Moltare>	My rat bastardry skills are weak, as you may have noted
13:54	<@McMartin>	Basically, * is allowed as long as it isn't followed by a slash.
13:55	<@McMartin>	And you know how to say "something that isn't a slash"
13:57	< Moltare>	So it's ((^*|^/)|(*^/))* ?
13:57	< Moltare>	Not an asterisk or slash, or asterisk as long as a slash doesn't follow
13:58	<@McMartin>	That doesn't look remotely like flex syntax
13:58	< Moltare>	I used the wrong shape brackets there
13:59	<@McMartin>	^ outside of the square brackets means "match the beginning of a line"
13:59	<@McMartin>	Also, /*////////*/ is an acceptable comment.
14:01	< Moltare>	[[^|^/]|[^/]|[^/]] ? That just complains at me
14:04	<@McMartin>	Yeah, you can't nest []s.
14:05	< Moltare>	How irritating.
14:05	<@McMartin>	[] is a special case in its own right.
14:05	<@McMartin>	[^|^/] is not an OR.
14:05	<@McMartin>	If you want "neither * nor /" that's [^*/].
14:05	< Moltare>	Is that not "not *, followed by /"?
14:06	<@McMartin>	No.
14:06	<@McMartin>	Because [^*/] matches a single character.
14:06	<@McMartin>	Specifically, any character that is not * or /.
14:06	<@McMartin>	It's equivalent to [^/*].
14:06	< Moltare>	Alright, then
14:08	< Moltare>	So how do I create ( /* followed by (either not * and not /, or * that isn't followed by /, or / that isn't preceded by *) an arbitrary number of times followed by */ ), then?
14:08	< Moltare>	(The lex manual I have here claims that | is an OR, idly)
14:08	<@McMartin>	Well, "/*" followed by (something) is easy.
14:08	<@McMartin>	Yes, | is indeed or.
14:09	<@McMartin>	However, when part of a [] token, | is "the vertical bar character"
14:09	<@McMartin>	Also, your last bit of the spec is wonky.
14:09	<@McMartin>	/* /* */ is a valid comment.
14:09	<@McMartin>	/* /* */ */ is a valid comment followed by a * and a /.
14:10	<@McMartin>	All that said
14:11	< Moltare>	/* /* */ fits, surely? It's /*, followed by / that is preceded by a space, followed by * that is followed by a space, followed by */
14:11	<@McMartin>	You can use () to group stuff up
14:11	<@McMartin>	Oh, I see, I missed the "preceded"
14:11	<@McMartin>	Try a rephrase.
14:12	<@McMartin>	"Either a single character that isn't *, or a * followed by something that isn't /..."
14:13	< Moltare>	"/*"(^*|^/)"*/" is what I'd got it down to
14:13	<@McMartin>	Close.
14:13	<@McMartin>	You're missing some []s in strategic locations.
14:13	<@McMartin>	And possibly some ""s.
14:14	< Moltare>	Much as I appreciate the help, I begin to see why asking for it drives Jaci insane. I'm not looking to learn, here, I just want the damn thing working so I can put it in my past :P
14:14		* Moltare fiddle some more, then
14:15	<@McMartin>	Moltare: And I've TAed this very class twice, and so I am deliberately nerfing myself, acting as if you were somebody wandering into my office hours.
14:15	<@McMartin>	I rather suspect this isn't going to help the attitude problems much.
14:15	< Moltare>	heh
14:16	< Moltare>	Victory, all the same
14:16	<@McMartin>	Good show
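The log never shows the pattern Moltare ended up with. One common way to write McMartin's rule ("* is allowed as long as it isn't followed by a slash") in flex is `"/*"([^*]|\*+[^*/])*\*+"/"`. The sketch below checks the same pattern using POSIX `regcomp()` instead of flex, so the escaping differs a little from what would go in a .l file. `comment_length` is a hypothetical helper name, not anything from the assignment.

```c
#include <assert.h>
#include <regex.h>

/* Length of the block comment that starts at s, or -1 if s does not
 * begin with a complete comment. The middle group accepts either a
 * non-star character, or a run of stars followed by something that is
 * neither a star nor a slash, so "* /" without the space can only
 * appear at the very end. */
int comment_length(const char *s)
{
    regex_t re;
    regmatch_t m[1];
    int len = -1;
    int rc = regcomp(&re, "^/\\*([^*]|\\*+[^*/])*\\*+/", REG_EXTENDED);

    assert(rc == 0);
    (void)rc;
    if (regexec(&re, s, 1, m, 0) == 0)
        len = (int)m[0].rm_eo;
    regfree(&re);
    return len;
}
```

Calling `comment_length("/* /* */ */")` stops at the first closing `*/`, which is the non-nested behaviour discussed above.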
14:17	<@McMartin>	I'm afraid I can't be a lot of help with a C-based recursive descent parser, though I can point you at ones that I wrote in Java and OCaml. The principles should be similar. =P
14:19		* Moltare applies hard-won knowledge, fixes his string definition into the bargain
14:21	< Moltare>	Ahh.. or not. Because a string "foo" currently reports as a string with value "foo" rather than a string with value foo...
14:21		* Moltare attempts to solo this one, first
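For the quoted-string problem, one possible approach (a sketch, not necessarily what Moltare did) is to copy the matched text without its first and last character inside the rule's action. `unquote` is a made-up helper name, and it assumes the lexeme really does start and end with a double quote.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a freshly allocated copy of text with the surrounding double
 * quotes removed, so "\"foo\"" becomes "foo". The caller owns the copy. */
char *unquote(const char *text)
{
    size_t len = strlen(text);
    char *out;

    assert(len >= 2 && text[0] == '"' && text[len - 1] == '"');
    out = malloc(len - 1);              /* len - 2 characters plus the NUL */
    assert(out != NULL);
    memcpy(out, text + 1, len - 2);
    out[len - 2] = '\0';
    return out;
}
```

In a flex action this would be called on yytext, for example `unquote(yytext)`.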
14:23		* McMartin goes to deal with breakfast
14:35	< Moltare>	Doesn't help that every time I go to write 'lexer' I write 'lever'
14:35	< Moltare>	Did it that time too
14:44	< Moltare>	Lunch!
15:06		* gnolam snerks.
15:06	<@gnolam>	http://www.imdb.com/name/nm2469945/
15:09	<@McMartin>	Hmm, and because I seem to have neglected to quote it in here:
15:09	<@McMartin>	"The defense grid can be full of lasery doom. The defense grid is not full of lasery doom."
15:12		* Moltare replaces his printfs with something more useful to the nascent parser
15:13	< Moltare>	Understanding check, plz?
15:13	< Moltare>	A rule should return a T_SOMEKINDOFTOKEN and possibly an associated yylval
15:14	<@McMartin>	Urgh. I haven't used flex proper in long enough to be able to answer that with confidence.
15:14	< Moltare>	There is also a .h file, whatever one of those is, that lists structures of T_ALLTHETOKENS
15:14	<@McMartin>	That sounds about right.
15:14	< Moltare>	ie type, value
15:14	<@McMartin>	Yeah
15:15	< Moltare>	Then the parser itself calls the yyparse() thing generated in lex.yy.c by flex, breaks it into left,right&centre and recurses it in the face
15:16	< Moltare>	erm, yylex() thing
15:16	<@McMartin>	I don't recall if the actual token return ends up in a global too or not
15:16	< Moltare>	and a hash table is involved to check if variables are present or not
15:17	<@McMartin>	Well, that's your doing, not flex's.
15:17	< Moltare>	The hash table? yes
15:17	< Moltare>	I've got a fragment of code here: struct token { char *lexeme; int type; int value; }, but no idea what I'm supposed to be doing with it
15:19	<@McMartin>	At this point you're in "what your assignment is" territory and we're unlikely to be a lot of help.
15:19	<@McMartin>	(By which I mean "involving the spec of the assignment", not "we aren't going to do your homework for you")
15:21	< Moltare>	As far as I've worked it out: parser.c has the parse() method which does the actual parsing, and #includes a tokens.h file and a lex.yy.c file.
15:21	< Moltare>	lex.yy.c is what flex creates.
15:22	<@McMartin>	Right.
15:22	< Moltare>	tokens.h defines the token structure globally as having a type and possibly a value, and lists the type for a given token
15:22	<@McMartin>	And presumably, right now your parse() is just reading the stream and dumping it?
15:22	< Moltare>	(as a big column of #define T_COMMA 125; etc
15:22	<@McMartin>	Right
15:23	< Moltare>	Right now I have no parse(), as I've only just got the lexer putting out stuff on command
15:23	<@McMartin>	OK.
15:23	< Moltare>	That, I think, is step 1
15:23	<@McMartin>	Aye.
15:23	< Moltare>	Get the tokens.h file and make it play nicely with a basic parse() method
15:23	<@McMartin>	So, parse() is going to be taking the output of the lexer as a stream of tokens, and turning it into some kind of (probably tree-recursive) structure.
15:24	< Moltare>	recursive-descent, as specified
15:24	< Moltare>	(in my spec, that is, not 'as I have already mentioned')
15:24	<@McMartin>	Well, that's the parser's implementation
15:24	<@McMartin>	By "tree-recursive" I mean that you're producing a list of Expressions or whatnot
15:24	< Moltare>	Oh.
15:24	< Moltare>	Yes.
15:24	<@McMartin>	And Expressions themselves can be made of expressions.
15:25	<@McMartin>	Have you done anything with unions in C?
15:25	< Moltare>	No, and I note that your use of "with unions" is superfluous.
15:25	< Moltare>	I have never touched C before this assignment.
15:26	<@McMartin>	OK, so.
15:26	<@McMartin>	A union is sort of like a struct, except that all of the members overlap.
15:26	<@McMartin>	This lets you do horrifically awful things to memory, much like everything else in C.
15:26	<@McMartin>	More to the point, it's a way to get Polymorphism.
15:26	<@McMartin>	You go, say:
15:26	<@McMartin>	struct TOKEN {
15:26	<@McMartin>	int tag;
15:26	<@McMartin>	union {
15:26	<@McMartin>	char * stringval;
15:26	<@McMartin>	int intval;
15:26	<@McMartin>	} value;
15:27	<@McMartin>	};
15:27	< Moltare>	OH, right
15:27	< Moltare>	So it can be either
15:27	< Moltare>	(what does the * represent?)
15:27	<@McMartin>	And then if you access the wrong value of the union you corrupt memory and possibly bring down the entire machine
15:27	<@McMartin>	"address of previous type"
15:27	<@McMartin>	C has no concept of strings.
15:28	<@McMartin>	Instead, you use an address of a character, and hope and pray that there is a null byte at an appropriate point in the future.
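A minimal sketch of the tagged-union idea, using McMartin's struct as written. The tag constants and the four helper functions are hypothetical names added for illustration. The point is that the tag is checked before the matching union member is read, and that `stringval` is only an address that `strlen()` walks until it finds the null byte.

```c
#include <assert.h>
#include <string.h>

enum { TAG_INT, TAG_STRING };           /* hypothetical tag values */

struct TOKEN {
    int tag;
    union {
        char *stringval;
        int intval;
    } value;                            /* both members share the same bytes */
};

struct TOKEN make_int(int v)
{
    struct TOKEN t;
    t.tag = TAG_INT;
    t.value.intval = v;
    return t;
}

struct TOKEN make_string(char *s)
{
    struct TOKEN t;
    t.tag = TAG_STRING;
    t.value.stringval = s;
    return t;
}

/* Only read the member that the tag says was written. */
int int_value(struct TOKEN t)
{
    assert(t.tag == TAG_INT);
    return t.value.intval;
}

size_t string_length(struct TOKEN t)
{
    assert(t.tag == TAG_STRING);
    return strlen(t.value.stringval);   /* counts up to the terminating '\0' */
}
```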
15:28	<@gnolam>	Eh, the real usefulness of unions lies in serialization.
15:28	<@gnolam>	IMO.
15:28	< Moltare>	The more I hear of C, the more I wonder why everyone hates Java so much. It seems to be far more intent on exsanguinating you
15:28	<@McMartin>	As an ML partisan, I beg to differ. They're for implementing Constructor types.
15:29	<@McMartin>	Moltare: C partisans feel that Java's inability to completely fuck you over for the tiniest mistake is an unconscionable assault on their freedom as a programmer.
15:29	<@ToxicFrog>	Moltare: C, unlike Java, is useful for implementing kernels, device drivers, and other low-level-but-we-don't-want-to-write-this-in-asm stuff.
15:29	<@McMartin>	And yes, said freedom is actually necessary for direct hardware control.
15:30	<@ToxicFrog>	The same features that make it useful for that also make it insanely dangerous, though~
15:30	<@McMartin>	That said, when your professor said that this would be vastly easier in C, he was lying through his teeth. I suspect his actual intent was to make you actually implement stuff on your own instead of just handing it over to library classes.
15:30	<@McMartin>	Like, you know, String.
15:30	<@McMartin>	And HashMap.
15:31	<@ToxicFrog>	Quite.
15:32	< Moltare>	So, having created our token structure and given the appropriate #define T_COMMA someintegervalue in the token.h file, I then get the lexer to return T_COMMA when it hits "," in the program you hand it
15:32	< Moltare>	"," { return T_COMMA; } sort of thing
15:33	<@McMartin>	(Also, less hostilely, because the Java version of the Tiger book uses a totally different technique than the C/ML version, revolving around Visitors)
15:33	<@McMartin>	Mol: That sounds about right, yes.
15:33	<@McMartin>	IIRC, calls to yylex() will assign some global structure that will let you get the juicy datameats out once this is done
15:33	<@ToxicFrog>	Although generally T_COMMA would be part of an enum, rather than a straight #define.
15:33	<@McMartin>	TF: It's flex. It does its own thing.
15:34	< Moltare>	And when it's a variable I get the lexer to assign yytext to stringval and then return T_VAR?
15:34	<@McMartin>	Right.
15:34	<@ToxicFrog>	McMartin: no, you need to provide them yourself.
15:34	< Moltare>	{ID} { stringval = yytext; return T_VAR; }
15:34	<@McMartin>	And then parse() needs to know that a T_VAR means you need to read stringval.
15:34	<@ToxicFrog>	That's what y.tab.h is for, but if you aren't using yacc, that doesn't get generated.
15:34	<@McMartin>	Aha
15:34		* McMartin has never used flex alone, so.
15:35	<@ToxicFrog>	So instead you need to write your own (say) tokens.h, and #include it in your lexer and parser
15:35	<@McMartin>	Aha.
15:35	< Moltare>	TF: Which is why I need to write the token.h file and #
15:35	< Moltare>	right
15:35	<@McMartin>	Anyway, he's right.
15:35	<@McMartin>	Instead of #define T_COMMA etc.
15:35	<@ToxicFrog>	And the contents are something like: enum Tokens { T_COMMA, T_SEMICOLON, T_OPENPAREN, T_STRING, T_INT, ..., T_NUMTOKENTYPES }
15:35	< Moltare>	I note I've never heard of enum; what's the distinction?
15:36	<@ToxicFrog>	Enum creates symbols rather than macros.
15:36	<@ToxicFrog>	#define is basically a global search-and-replace.
15:36	<@ToxicFrog>	Enum creates what are, in effect, constants with automatically assigned values.
15:39	< Moltare>	I thought global search-and-replace was what I was doing here
15:39	<@ToxicFrog>	...
15:39	< Moltare>	"When you see T_COMMA, read it as 145 and give that to the "type" variable"
15:40	< Moltare>	Or am I totally lost again?
15:40	<@ToxicFrog>	It is what you are doing with #define, yes
15:40	<@McMartin>	See, the idea here is that it's better to say "read it as something unique, I don't care what"
15:40	<@ToxicFrog>	As a general rule, though, you don't want to do that if you don't have to; and using an enum guarantees that all the values are unique without you having to worry about that, too.
15:44	< Moltare>	Fair enough; but then what goes in "tag" in McM's example struct above? since we're not giving them integer tag values
15:45	<@McMartin>	enums are secretly integers
15:45	<@McMartin>	What enum does is abstract out what the actual value is
15:46	<@McMartin>	(Also, "tag" in this case is actually what yylex() is returning)
15:46	< Moltare>	um
15:46	<@McMartin>	(when you return T_COMMA or what not, that value is assignable to an int variable)
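To make the enum point concrete, here is a small sketch built on ToxicFrog's list, abridged to the five names he spelled out. `token_name` and the name table are hypothetical additions. The compiler numbers the constants from 0, so the trailing `T_NUMTOKENTYPES` ends up equal to the number of real token types.

```c
#include <assert.h>
#include <string.h>

enum Tokens { T_COMMA, T_SEMICOLON, T_OPENPAREN, T_STRING, T_INT, T_NUMTOKENTYPES };

static const char *const token_names[T_NUMTOKENTYPES] = {
    "T_COMMA", "T_SEMICOLON", "T_OPENPAREN", "T_STRING", "T_INT"
};

/* An enum constant is an int, so it can be stored in an int field such
 * as a tag, returned from a function, or used as an array index. */
const char *token_name(int type)
{
    if (type < 0 || type >= T_NUMTOKENTYPES)
        return "?";
    assert(token_names[type] != NULL);
    return token_names[type];
}
```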
15:47	< Moltare>	And flex knows to drop the value of T_COMMA into tag automatically?
15:48	<@McMartin>	I don't believe so, now that you mention it - I defer to TF on how the API actually works.
15:48	<@McMartin>	I used it merely as an example.
15:48	<@McMartin>	You'd have some other enum for expression types - and that's where tags would go and such
15:49	<@McMartin>	You'd just be reading return values and the global yytext value when interpreting lexemes.
15:49	<@ToxicFrog>	Alternately, have flex construct and return the token struct
15:49	<@McMartin>	Oh god, the memory management hassles =(
15:49	<@McMartin>	It's going to be bad enough with the AST.
15:50	<@ToxicFrog>	[0-9]+ { Token * tok = new_token(); tok->type = T_INTEGER; tok->value.intval = atol(yytext); return tok; }
15:51	<@ToxicFrog>	IME, this makes the code more clear while making memory management slightly trickier.
15:51	<@ToxicFrog>	But not hugely trickier; it's just callee-allocates, caller-frees.
15:52	<@McMartin>	With added ugliness if you need to duplicate yytext's values; you'll need to ensure that either all strvals are safe to free, or that none of them need to be.
15:52	< Moltare>	I note the lecturer specifically stated "don't bother freeing memory, it's not worth it for this"
15:52	<@ToxicFrog>	But yes. Flex doesn't know anything about what structures you're using, or tags, or anything.
15:52	<@ToxicFrog>	When it gets a match, it sets yytext to the actual text that matched, executes the corresponding code, and that's it.
15:53	<@McMartin>	Is yytext a global or an argument of some kind?
15:53	<@ToxicFrog>	Global. extern char * yytext, IIRC.
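Because yytext is a global that flex reuses for every match, a token that needs to keep its text should store a copy rather than the pointer itself, which is the duplication McMartin mentions. A sketch, with `copy_lexeme` and `survives_overwrite` as hypothetical names; the second function only simulates the scanner overwriting its buffer and does not involve flex.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Heap copy of text that stays valid after the scanner moves on. */
char *copy_lexeme(const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);

    assert(copy != NULL);
    memcpy(copy, text, n);
    return copy;
}

/* Simulation: buf stands in for the scanner's buffer. The alias sees the
 * new contents after the overwrite; the copy still holds the old ones. */
int survives_overwrite(void)
{
    char buf[16] = "foo";
    char *alias = buf;
    char *copy = copy_lexeme(buf);
    int ok;

    strcpy(buf, "bar");                 /* the "next token" arrives */
    ok = strcmp(alias, "bar") == 0 && strcmp(copy, "foo") == 0;
    free(copy);
    return ok;
}
```

With this, Moltare's identifier rule would assign `copy_lexeme(yytext)` rather than yytext itself.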
15:56	< Moltare>	So if I do it that way, I don't have to play around with token.h? just define the token struct at the top of the lexer and have it return tokens which potentially have values added?
15:56	<@ToxicFrog>	You still need token.h
15:56	<@ToxicFrog>	Otherwise, how does the parser tell what kind of token it is?
15:56	<@McMartin>	Otherwise the value T_INTEGER or whatnot won't exist.
15:57	< Moltare>	Right, but it'd just be a list of "This is T_INTEGER; it has an intval in it. This is T_COMMA; it has nothing in it. This is T_..."
15:58	<@McMartin>	Yeah.
15:58	<@McMartin>	And actually, the "it has a FOO in it" can be implicit.
15:58	<@ToxicFrog>	Er
15:58	<@ToxicFrog>	?
15:58	<@McMartin>	As long as it's unique, and parse() only ever reads the right value, Life Is Good.
15:58	<@ToxicFrog>	Wouldn't it be a list of enums and a -single- struct-union definition?
15:59	<@McMartin>	Well, if you're making a Universal Token Type.
15:59	<@McMartin>	I'm imagining a case where you're communicating solely through a stream of globals and return values a la yytext.
15:59	< Moltare>	If the struct-union definition is in lexer.l's definitions, would you need it in token.h too?
15:59	<@McMartin>	It would probably be better to have it only be in token.h, unless there's some bizarre part of lexer.l I'm not grokking
16:00	<@ToxicFrog>	Moltare: it would be -only- in token.h
16:00	<@ToxicFrog>	Which is then #included by both the lexer and the parser
16:00	< Moltare>	oh, right
16:00	<@ToxicFrog>	Thus, they get the same definition for the token types, and for the layout of a Token struct, and they agree on everything
16:03	<@ToxicFrog>	http://lua.pastey.net/83554 -- a very simple example which assumes tokens only need to worry about int or string values (or no values)
16:03	<@ToxicFrog>	So, .tag is set to T_<something>, so that you can tell what kind of token a given Token struct is.
16:03	<@ToxicFrog>	And if that type has an associated value (say, T_INTEGER or T_STRING), the corresponding .value.<type>val is filled in.
16:04	< Moltare>	Makes sense
16:04	<@ToxicFrog>	So, the lexer can create and populate a Token struct appropriately for each token; and the parser can then look at that struct and figure out what kind it is and what value, if any, it has.
16:04	<@ToxicFrog>	(and then based on that the parser does the actual parsing thing)
16:05	< Moltare>	And do I need to define new_token() somewhere?
16:05	<@McMartin>	Yeah. That's essentially a one-liner
16:05	<@McMartin>	return malloc (sizeof (Token));
16:06	<@McMartin>	"Give me a chunk of uninitialized memory of this size"
16:06	<@McMartin>	If you want it to be zeroed by default, use calloc
16:06	<@ToxicFrog>	Token * new_token() { return malloc(sizeof(Token)); } /* create enough memory to hold a Token and return a pointer to it */
16:06	< Moltare>	Oh, right, the whole 'manage your own memory' thing
16:06	<@McMartin>	Any malloc()ed memory will need to be manually free()ed when you're done with it.
16:06	<@McMartin>	And for God's sake, only ever free() it once, and don't access it after it's been free()ed.
16:06	< Moltare>	It doesn't need free()ing, sez lecturer
16:07	<@ToxicFrog>	In this case you probably don't need to worry about that, because it's a short-running program and the OS will free it all when it exits.
16:07	<@ToxicFrog>	Which is what the lecturer is saying.
16:07	<@ToxicFrog>	(sidenote: if you have "Token * foo", you access its internals with "foo->tag", rather than "Token foo" and "foo.tag")
16:08	<@ToxicFrog>	(and since you are now playing with memory management and pointers, you will indeed have Token * foo)
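Putting the allocation advice together, a sketch under some assumptions: the type names and the two-entry enum are illustrative only, `new_int_token` is a hypothetical helper, and `calloc()` is used so the token starts zeroed, as McMartin suggests. Note the `->` on the pointer and the plain `.` on the union, which is not a pointer.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { T_COMMA, T_INTEGER } TokenType;   /* abridged for the example */

typedef struct {
    TokenType tag;
    union {
        char *stringval;
        long intval;
    } value;
} Token;

/* Callee allocates; the caller would normally free() the result. */
Token *new_token(void)
{
    Token *tok = calloc(1, sizeof *tok);

    assert(tok != NULL);
    return tok;
}

/* What a rule like [0-9]+ might do with its matched text. */
Token *new_int_token(const char *text)
{
    Token *tok = new_token();

    tok->tag = T_INTEGER;
    tok->value.intval = strtol(text, NULL, 10);
    return tok;
}
```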
16:10		Vornicus-Latens [~vorn@Admin.Nightstar.Net] has quit [Ping Timeout]
16:15	< Moltare>	C doesn't natively do scientific notation, does it?
16:16	<@ToxicFrog>	Yes it does.
16:16	< Moltare>	Ah? handy
16:16	<@ToxicFrog>	double foo = 1.0e+06; /* compiles! */
16:16	< Moltare>	I tried to look it up but found no references that weren't C++ or C#
16:17	<@McMartin>	Is atof() smart enough to read those?
16:17	<@ToxicFrog>	Yes.
16:17	<@ToxicFrog>	Well, strtod() is
16:17	<@ToxicFrog>	And the atof man page says the behaviour is "identical to strtod except it does not report errors"
16:18	<@McMartin>	Ah yes, another grand C tradition.
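Since `atof()` cannot report a failure, a lexer that wants to reject bad numbers can call `strtod()` directly and check where parsing stopped. A small sketch; `parse_or` is a made-up helper that returns a fallback value when the text is not entirely a valid number.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parses s as a double (scientific notation included). Returns fallback
 * if nothing was parsed, if trailing junk follows the number, or if the
 * value is out of range. */
double parse_or(const char *s, double fallback)
{
    char *end;
    double v;

    assert(s != NULL);
    errno = 0;
    v = strtod(s, &end);
    if (end == s || *end != '\0' || errno == ERANGE)
        return fallback;
    return v;
}
```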
16:18	<@McMartin>	In other news, gets() is still required by all conforming runtimes.
16:18		* ToxicFrog heads off to campus. Later!
16:18		* McMartin goes to perform his ablutions.
16:19	< Moltare>	hooray for stuff!
16:20	< Moltare>	(and thanks for your patience)
16:20	<@McMartin>	(flex ends up being an actually useful tool in its own right, for all kinds of stuff)
16:20	<@McMartin>	(Granted, in nearly all of these cases you should for the love of God not be writing it in C)
16:25		gnolam [lenin@Nightstar-10613.8.5.253.static.se.wasadata.net] has quit [Ping Timeout]
16:26		gnolam [lenin@85.8.5.ns-20483] has joined #Code
16:26		mode/#code [+o gnolam] by ChanServ
16:30	< Moltare>	31 errors, woo
16:30	< Moltare>	Although 28 of them appear to be identical
16:30	<@McMartin>	Forgotten commas?
16:32	< Moltare>	No, it's complaining about enum token { T_BLAH... } and struct token { stuff here }
16:33	< Moltare>	Conflicting types, previous declarations, and something about not being able to return voids which I think might be a result of the first one. Also lots of whining about token which is clearly relating to the first issue
16:33	<@McMartin>	Ah, yes.
16:34	<@McMartin>	(I think you want typedef enum { T_BLAH... } TOKEN; and typedef struct token_struct { ... } token; )
16:37	< Moltare>	yay, different errors
16:37	< Moltare>	I don't even understand what the first one is saying, this time
16:37	< Moltare>	Says, "In function `struct token * new_token()':"
16:38	< Moltare>	Or is that just "All of the following errors are here" or similar?
16:39	<@McMartin>	Yeah, that's "look here for what's going on"
16:39	< Moltare>	Then there's one "ANSI C++ forbids implicit conversion from `void *' in return", and a shitload of "request for member `type' in `tok', which is of non-aggregate type `token *'" and "return to `int' from `token *' lacks a cast
16:39	< Moltare>	"
16:40	<@McMartin>	OK, the shitload is of you using .type instead of ->type
16:41	<@McMartin>	The ANSI C++ thing can DIAF and should only be a warning anyway, since you aren't writing C++...
16:41	<@McMartin>	If you want to get rid of it, make it return (token *)malloc (etc)
16:42	< Moltare>	ta
16:42	<@McMartin>	The "return to int from token *' lacks a cast" implies to me that your function declaration either forgot to declare a return type, or the people calling it don't know about its types
16:42	<@McMartin>	If the former, declare new_token as "token * new_token(void)"
16:43	<@McMartin>	If the latter, add the prototype "token *new_token(void);" - with the semicolon - to token.h
16:43	<@McMartin>	After the definition of the type
16:43	< Moltare>	(Is it token->value.intval or token->value->intval, idly?)
16:44	<@McMartin>	(token->value.intval, as value is not a pointer)
16:44	< Moltare>	(excellent, got something right)
16:47	< Moltare>	(with the result that I've only got the 'lacks a cast' ones left fiddlefiddle)
16:47	<@McMartin>	C assumes that any function it's never heard of returns an int and takes any number of untyped arguments
16:47	<@McMartin>	This is Always A Horrifically Bad Idea, so you need to type-declare them first in the header files.
16:49	< Moltare>	how random. Why int?
16:49	<@McMartin>	Because C's predecessor language only had two types; int and int*.
16:50	< Moltare>	Isn't that a bit... limiting?
16:50	<@McMartin>	With "int" defined as "whatever size you can shove into the hardware's register"
16:50	<@McMartin>	This would have been the late 60s/early 70s.
16:50	<@McMartin>	The idea that you could write "x+y*z" and have it work out operator precedence and assign temporary registers and stuff was still Hot Shit.
16:51	<@McMartin>	Though not brand new, the way it was for FORTRAN.
16:51	<@McMartin>	So called because it was a FORmula TRANSlator, and thus astonishing and new
16:53	< Moltare>	hm. Adding the prototype breaks the "forbids implicit conversion" thing again. And then removing it doesn't unbreak it.
16:53	< Moltare>	Za.
16:53	<@McMartin>	OK, the prototype should be there anyway
16:54	<@McMartin>	Where's the "implicit conversion" error?
16:54	<@McMartin>	And what's the line that produces it?
16:54	< Moltare>	I'd tell you, but it's gone again
16:55	<@McMartin>	Hmm.
16:55	< Moltare>	Also Dev-C++ is now telling me it's out of memory, and not letting me close it
16:55	<@McMartin>	whut
16:56		* Moltare process-kills it, starts it up again, shrugs
16:58	< Moltare>	Right, now it's back to only doing the lacks-a-cast thing
16:58	<@McMartin>	What line produces it?
16:58	< Moltare>	Any line of the .l file that attempts to return a token
16:59	< Moltare>	"if" { token * tok = new_token(); tok->type = T_IF; return tok; } and its ilk
16:59	< Moltare>	(hence 26 of the original 31 errors being identical)
16:59	<@McMartin>	Aha
16:59	<@McMartin>	This sounds like yylex() isn't being prototyped.
16:59	<@McMartin>	Maybe add a token *yylex(void); to token.h too?
16:59	<@McMartin>	I'm stabbing in the dark here
17:00	<@McMartin>	If yylex() has a forced prototype, then you're kind of screwed
17:01	< Moltare>	15 tokens.h
17:01	< Moltare>	ambiguates old declaration `struct token * yylex()'
17:01	< Moltare>	New and shiny extra error from that
17:01	<@McMartin>	Where was the old declaration?
17:01	< Moltare>	I never made the old declaration
17:01	< Moltare>	Presumably flex did it for me
17:02	<@McMartin>	Hm. Somewhere in lexer.l you've defined "struct token *"
17:02	<@McMartin>	Turn that to just "token *" if you did the typedef
17:03	< Moltare>	I've not defined "struct token *" anywhere
17:03	<@McMartin>	Hum
17:03	<@McMartin>	Can you paste lexer.l somewhere?
17:04	< Moltare>	Certainly
17:05	< Moltare>	Can't send the url
17:05	< Moltare>	oh, wate
17:05	< Moltare>	no voice :P
17:05	< Moltare>	(pm'd)
17:13		You're now known as TheWatcher[afk]
17:19		mode/#code [+v Moltare] by ChanServ
17:29	<+Moltare>	Hashtable next, then!
17:29	<+Moltare>	- stores variables
17:29	<+Moltare>	- has place(thing) and find(thing)
17:30	<+Moltare>	if find(thing) fails, uses place(thing)
17:30	<+Moltare>	- is basically just like I'd do it in Java?
17:30	<+Moltare>	oh, and
17:30	<+Moltare>	- goes in the parser .c file?
17:40		* Moltare makes a shoddy first draft, leaves it for now
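Moltare's draft is not shown in the log. As a rough illustration of the find/place design he lists, here is a small chained hash table for variable names. Everything in it (the names, the bucket count, the hash function, the `int value` slot) is an assumption made for the example rather than part of his assignment.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64

struct entry {
    char *name;
    int value;
    struct entry *next;                 /* next entry in the same bucket */
};

static struct entry *buckets[NBUCKETS];

static unsigned hash(const char *s)
{
    unsigned h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Returns the entry for name, or NULL if it has not been placed yet. */
struct entry *find(const char *name)
{
    struct entry *e;

    for (e = buckets[hash(name)]; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

/* Adds a new entry for name at the head of its bucket. */
struct entry *place(const char *name)
{
    unsigned h = hash(name);
    size_t n = strlen(name) + 1;
    struct entry *e = malloc(sizeof *e);

    assert(e != NULL);
    e->name = malloc(n);
    assert(e->name != NULL);
    memcpy(e->name, name, n);
    e->value = 0;
    e->next = buckets[h];
    buckets[h] = e;
    return e;
}

/* "If find(thing) fails, uses place(thing)". */
struct entry *find_or_place(const char *name)
{
    struct entry *e = find(name);

    return e != NULL ? e : place(name);
}
```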
17:41	< C_tiger>	Mol: if you haven't got the regex: \/\*(.*?)\*\/ will work for you
17:41	<+Moltare>	I got it, but thank you all the same :)
17:42	< C_tiger>	Yeah, there was a little too much upscroll to read.
17:43	< C_tiger>	But that's literal /* (any character as many times as needed but MINIMAL number of times so the rest of the regex fits) literal */
17:43	< C_tiger>	parentheses unnecessary.
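The "minimal number of times" behaviour C_tiger describes can be sketched in plain C without a regex engine, since POSIX regular expressions have no non-greedy quantifier. This swaps the regex for a `strstr` scan: each comment is closed at the first terminator found after its opener, which is what the non-greedy match does. The function name `strip_block_comments` is hypothetical, and the sketch assumes it does not need to recognise comment markers inside string literals.

```c
#include <string.h>

/* Remove every block comment from s in place and return s.
 * A comment starts at slash-star and ends at the FIRST star-slash
 * after it (the minimal match). An unterminated comment is left as-is.
 * Assumption: comment markers inside string literals are not handled. */
char *strip_block_comments(char *s)
{
    char *read = s;
    char *write = s;

    while (*read != '\0') {
        if (read[0] == '/' && read[1] == '*') {
            char *end = strstr(read + 2, "*/");
            if (end != NULL) {
                read = end + 2;     /* skip the whole comment */
                continue;
            }
        }
        *write++ = *read++;         /* ordinary character: keep it */
    }
    *write = '\0';
    return s;
}
```

A greedy match would instead run from the first opener to the last terminator in the input and swallow any code between two comments.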
17:51		Vornotron [~vorn@Admin.Nightstar.Net] has joined #code
17:57	<@McMartin>	Also, for the record, from the PM discussion
17:58	<@McMartin>	If you're using flex and you want to return something that isn't an integer, you have to #define YY_DECL to be the (semicolon-free) prototype for your lexer function.
17:59	<@McMartin>	Otherwise you'll get type conflicts, which are The Lose
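A minimal sketch of what that define does, assuming the `token` typedef and `new_token()` helper mentioned earlier in the log (their bodies here are guesses, as is the value of `T_IF`). flex writes the scanner as `YY_DECL { ... }` and only falls back to `int yylex(void)` when the macro is undefined, so the hand-written function below stands in for the generated one.

```c
#include <stdlib.h>

/* Hypothetical token type; the log only shows tok->type and T_IF. */
enum token_type { T_IF = 1 };

typedef struct token {
    int type;
} token;

/* Allocate a zeroed token; returns NULL if allocation fails. */
token *new_token(void)
{
    return calloc(1, sizeof(token));
}

/* In a real lexer.l this line would sit in the %{ ... %} prologue.
 * The macro fixes the scanner's return type and name.
 * Note that there is no trailing semicolon. */
#define YY_DECL token *yylex(void)

/* Hand-written stand-in for the generated scanner body: it always
 * returns a T_IF token, like the "if" rule quoted in the log. */
YY_DECL
{
    token *tok = new_token();
    if (tok != NULL)
        tok->type = T_IF;
    return tok;
}
```

With the macro in place, a prototype such as `token *yylex(void);` in a shared header agrees with the generated definition instead of conflicting with it.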
18:39		Vornotron [~vorn@Admin.Nightstar.Net] has quit [Ping Timeout]
18:46	<@AnnoDomini>	Hm. Would anyone know where I could find an implementation of the Bresenham algorithm in assembly? Preferably x86, but most anything will do.
18:47		Vornotron [~vorn@Admin.Nightstar.Net] has joined #code
18:59		You're now known as TheWatcher
19:04		* AnnoDomini will try converting the pseudocode from Wikipedia, then.
19:06	< Vornotron>	What's the subject?
19:06	<@AnnoDomini>	Bresenham line algorithm in assembly.
19:06	< Vornotron>	Aha
19:08	<@AnnoDomini>	We're supposedly given a pseudocode for it in the materials for the class, but it looks to be the basic version, which won't help me.
19:08	<@AnnoDomini>	And it doesn't work, either.
19:17	<@McMartin>	I have C code for it in my first edition Graphics Gems book.
19:18	<@McMartin>	I believe I used it to create an assembler Bresenham's for the C64.
19:18	< Vornotron>	Bresenham is pretty easy, in the end.
19:18	<@McMartin>	Which is the wrong chip. =P
19:18	<@McMartin>	Yes.
19:18	<@McMartin>	SDL_gfx also has an implementation of it, I believe.
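For reference, a short C sketch of the integer-only line algorithm under discussion, in the common form that handles all directions with a single error term. This is not the Graphics Gems or SDL_gfx code; the function name `bresenham_line` and the array-based interface are assumptions made for illustration.

```c
#include <stdlib.h>

/* Write the pixels of the line (x0,y0)-(x1,y1) into xs[] and ys[],
 * storing at most max_points of them. Returns the number stored.
 * Uses only integer addition, comparison and doubling. */
int bresenham_line(int x0, int y0, int x1, int y1,
                   int *xs, int *ys, int max_points)
{
    int dx = abs(x1 - x0);
    int dy = -abs(y1 - y0);
    int sx = (x0 < x1) ? 1 : -1;    /* step direction in x */
    int sy = (y0 < y1) ? 1 : -1;    /* step direction in y */
    int err = dx + dy;              /* combined error term */
    int n = 0;

    while (n < max_points) {
        int e2;

        xs[n] = x0;
        ys[n] = y0;
        n++;
        if (x0 == x1 && y0 == y1)
            break;                  /* reached the end point */

        e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step in x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step in y */
    }
    return n;
}
```

Each pass is a handful of adds, compares and conditional branches, which is why it translates to assembly fairly directly.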
20:02		Vornotron is now known as Finerty
20:08	<@MyCatVerbs>	McMartin: type conflicts in _C_ of all languages are Double Lose, with Extra Lose on the Side.
20:11		Attilla [~The.Attil@194.72.70.ns-11849] has quit [Quit: <Insert Humorous and/or serious exit message here>]
20:14	<@ToxicFrog>	Moltare: concerning the hash table: typically this would go in a separate file, say, hash.c
20:15	<+Moltare>	Oh, which I then #include in parser.c?
20:15	<@ToxicFrog>	It gets a corresponding header, hash.h, which other files that use it #include (and contains function declarations and suchlike)
20:15	<@ToxicFrog>	No, you #include the hash.h
20:15	<@ToxicFrog>	The actual function _code_ goes in hash.c
20:15	<@ToxicFrog>	Which then gets combined with the rest of the program at link time.
20:16	<@ToxicFrog>	(also: it occurs to me that if you're using Dev-C++, the "ANSI C++ forbids..." warnings might be because it's trying to compile it as C++; double check your project settings)
20:17		* gnolam ponders launching into one of his anti-Dev-C++ rants again.
20:17		* GeekSoldier gets the popcorn.
20:17	<@ToxicFrog>	gnolam: hold it until after Mol is done with his homework, please?
20:18	<@ToxicFrog>	And for the record, I suggested just using gcc/MSYS directly, since this is a small project.
20:18	<@ToxicFrog>	Moltare: anyways. The idea is that a .c file holds actual code. A .h file holds struct, enum, and function declarations, #defines, and whatnot - everything that other .c files need in order to make use of that code.
20:19	<@ToxicFrog>	At build time, all the .c become .o (object code), and those are all combined into your executable or library.
20:22	<@gnolam>	McMartin: you wouldn't happen to have code for Bresenham ellipses in there somewhere as well?
20:22	<@McMartin>	(The key difference here is that C, unlike Java or Python, compiles each code file in a vacuum)
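The header/source split described above might look roughly like this for the variable table. The two parts would normally be separate files, hash.h and hash.c; they are shown together here so the sketch compiles as one unit. The names `hash_find`, `hash_place`, `HASH_BUCKETS` and the `int value` payload are assumptions, and `hash_place` folds in the "if find fails, place" step from earlier in the log.

```c
/* ---- hash.h: declarations only (what other .c files #include) ---- */
#ifndef HASH_H
#define HASH_H

#define HASH_BUCKETS 64

typedef struct hash_entry {
    char *name;                 /* variable name (owned copy) */
    int value;                  /* payload; an int for illustration */
    struct hash_entry *next;    /* next entry in the same bucket */
} hash_entry;

typedef struct hash_table {
    hash_entry *buckets[HASH_BUCKETS];
} hash_table;

hash_entry *hash_find(hash_table *t, const char *name);
hash_entry *hash_place(hash_table *t, const char *name);

#endif /* HASH_H */

/* ---- hash.c: definitions (would start with #include "hash.h") ---- */
#include <stdlib.h>
#include <string.h>

/* djb2-style string hash reduced to a bucket index. */
static unsigned hash_index(const char *name)
{
    unsigned long h = 5381;
    while (*name != '\0')
        h = h * 33 + (unsigned char)*name++;
    return (unsigned)(h % HASH_BUCKETS);
}

/* Return the entry for name, or NULL if it is not in the table. */
hash_entry *hash_find(hash_table *t, const char *name)
{
    hash_entry *e = t->buckets[hash_index(name)];
    while (e != NULL && strcmp(e->name, name) != 0)
        e = e->next;
    return e;
}

/* Return the existing entry for name, or insert a new zero-valued one.
 * Returns NULL only if allocation fails. */
hash_entry *hash_place(hash_table *t, const char *name)
{
    unsigned i = hash_index(name);
    hash_entry *e = hash_find(t, name);

    if (e != NULL)
        return e;
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->name = malloc(strlen(name) + 1);
    if (e->name == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->name, name);
    e->value = 0;
    e->next = t->buckets[i];    /* push onto the front of the bucket */
    t->buckets[i] = e;
    return e;
}
```

Because each .c file is compiled on its own, parser.c only ever sees the declarations from hash.h; the definitions are matched up at link time.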
20:22	<@McMartin>	gnolam: I have no idea; the book is buried somewhere in my closet
20:27		Moltare [~moltare@Nightstar-29340.cable.ubr02.bath.blueyonder.co.uk] has quit [Ping Timeout]
20:27		Moltare [~moltare@Nightstar-29340.cable.ubr02.bath.blueyonder.co.uk] has joined #code
20:32		Attilla [~The.Attil@194.72.70.ns-11849] has joined #code
20:33		Moltare [~moltare@Nightstar-29340.cable.ubr02.bath.blueyonder.co.uk] has quit [Ping Timeout]
20:40		Moltare [~moltare@82.32.73.ns-25785] has joined #code
20:42	<@ToxicFrog>	Wibs.
20:42	<@AnnoDomini>	Question - what flag signifies the presence of a negative number in the x86 architecture?
20:43	<@AnnoDomini>	SF?
20:43	<@AnnoDomini>	Can't seem to find anything on x86 flags.
20:45	<@ToxicFrog>	Yes, it's SF
20:46	<@ToxicFrog>	Stands for "Sign Flag"
21:52	<@AnnoDomini>	Damn it. Conditional statements are such a pain in assembly. :/
21:54	<@ToxicFrog>	It's just branches.
21:54	<@McMartin>	GOTO, M-Fer! DO YOU SPEAK IT!
21:55		* AnnoDomini laughs.
21:57	<@gnolam>	They say JMP and you say "How high?".
22:00	<@AnnoDomini>	Ohgodaforloop.
22:01	<@ToxicFrog>	Many processors have an instruction specifically for doing for loops.
22:01	<@ToxicFrog>	Even in the ones that don't it's usually pretty straightforward.
22:02	<@AnnoDomini>	Won't work for me here, as I need the counter to increase, rather than decrease. It'll just be easier to make it myself.
22:02	<@ToxicFrog>	So it decomposes into INC, CMP, BRA
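That decomposition can be sketched in C with explicit labels and gotos, which maps nearly one-to-one onto the assembly. The function name `sum_below` is made up for the example, and the JL mentioned in the comment is just one x86 branch that fits a signed "less than" test.

```c
/* Sum 0..n-1 with an up-counting loop written as explicit
 * increment / compare / branch steps. */
int sum_below(int n)
{
    int i = 0;
    int total = 0;

    goto check;             /* test the condition before the first pass */
body:
    total += i;             /* loop body */
    i++;                    /* INC */
check:
    if (i < n)              /* CMP */
        goto body;          /* conditional branch, e.g. JL on x86 */
    return total;
}
```

Calling `sum_below(5)` adds 0 + 1 + 2 + 3 + 4, and a non-positive `n` skips the body entirely.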
22:15		Moltare [~moltare@82.32.73.ns-25785] has quit [Ping Timeout]
22:35	<@AnnoDomini>	Haaaate.
22:35	<@AnnoDomini>	Turns out some bastard put a nonworking implementation on Wikipedia.
22:35	<@AnnoDomini>	And nobody who tried to implement it has bothered to put a notice near the code.
23:12	<@Reiver>	...what
23:15	<@AnnoDomini>	Ah, excellent. This implementation actually looks like it works. I do seem to have some bugs in this code, though, as every time I run it, it crashes DOSBox.
23:23	<@AnnoDomini>	What I don't like is that I don't understand the compiler directives in use here. It's what the lecturer used, but damn him, he didn't explain it very well.
23:32		Finerty is now known as Vornicus
23:46	<@AnnoDomini>	Bug found - forgot RET at the end of subroutine.
23:47	<@AnnoDomini>	AWESOME.
23:47	<@AnnoDomini>	It actually works.
23:53		GeekSoldier is now known as GeekSoldier\|bed
--- Log closed Mon Mar 10 00:00:14 2008
