code logs -> 2007 -> Sun, 25 Nov 2007< code.20071124.log - code.20071126.log >
--- Log opened Sun Nov 25 00:00:52 2007
00:01
<@ToxicFrog>
Yeah. Regexes are what you need here.
00:02
<@ToxicFrog>
'Effect [0-9]+ Starting Offset Y := ([0-9]+)'
00:05
<@Reiver>
Brackets for doing what, out of interest?
00:05
<@ToxicFrog>
Captures.
00:05
<@Reiver>
Captures?
00:05
<@ToxicFrog>
Sections of the regex you're particularly interested in, and want to manipulate separately.
00:05
<@ToxicFrog>
Using Lua here, but I believe python does it similarly, if I did something like:
00:05
<@Reiver>
...So I run the regex, and it automatically splits out the final number?
00:06
<@ToxicFrog>
foo = "the number is 64"
00:06
<@Vornicus>
Square brackets are for "choose a character"
00:06
<@ToxicFrog>
foo:match("number is [0-9]+") => "number is 64"
00:06
<@ToxicFrog>
foo:match("number is ([0-9]+)") => "64"
00:07
<@Vornicus>
Parentheses are for "this is a pile of things that go together, and I'll capture and give you the result of that section of regex."
00:07
<@Vornicus>
Curly braces are for "the previous thing should show up a certain number of times"
00:07
<@ToxicFrog>
So typically, you have a regex that matches the entire line/block/etc, and then you enclose individual pieces in (), so you get them separately
00:07
<@ToxicFrog>
And then you do something with them.
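For readers following along in Python, the Lua captures above might look roughly like the sketch below. It uses the standard `re` module; the variable names `whole` and `captured` are only illustrative.

```python
import re

foo = "the number is 64"

# Without parentheses, group(0) is the whole matched text.
whole = re.search(r"number is [0-9]+", foo)
print(whole.group(0))

# With parentheses, group(1) is just the captured piece.
captured = re.search(r"number is ([0-9]+)", foo)
print(captured.group(1))
```

Python hands back a match object instead of the captured strings directly, so the pieces are read with `group()`.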
00:08
<@Vornicus>
Note that parentheses aren't /just/ for captures
00:08
<@ToxicFrog>
Not everything supports grouping, though.
00:08
<@ToxicFrog>
Whereas everything I'm aware of does support captures.
00:09
<@ToxicFrog>
But, yes. In some regex implementations, you can write, say, (abc)? to mean 'either abc or nothing'
00:10
<@ToxicFrog>
Or (abc)+ to mean 'one or more repetitions of abc'
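The three kinds of brackets described above can be tried side by side. This is a small sketch against Python's `re` module, which does support grouping; the strings being matched are made up for illustration.

```python
import re

# Square brackets: any one character from the set.
one_of = re.fullmatch(r"[abc]", "b")

# Curly braces: the previous item repeated a given number of times.
three_digits = re.fullmatch(r"[0-9]{3}", "123")
two_digits = re.fullmatch(r"[0-9]{3}", "12")      # None, too short

# Parentheses as a group: the quantifier applies to all of "abc".
optional_absent = re.fullmatch(r"x(abc)?y", "xy")
optional_present = re.fullmatch(r"x(abc)?y", "xabcy")
repeated = re.fullmatch(r"(abc)+", "abcabc")
partial = re.fullmatch(r"(abc)+", "abcab")        # None, last group incomplete
```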
00:17 GeekSoldier is now known as GeekSoldier|bed
00:18
<@Reiver>
Aha.
00:18
<@Reiver>
Hmn.
00:18
<@Reiver>
So if I have regexed like that, I then want to... hmn
00:19
<@Reiver>
How would I take that captured number, modify it, then bolt it /back onto/ the line?
00:19
<@ToxicFrog>
Some languages have functions that do that in one step; lua has string.gsub, and since lua's string library is largely yoinked from python, python probably has an equivalent
00:20
<@ToxicFrog>
You can also just put the string back together by hand:
00:20
<@ToxicFrog>
x,y = buffer:match("The coordinates are ([0-9]+),([0-9]+)")
00:20
<@ToxicFrog>
x = x*2; y = y/2
00:21
<@ToxicFrog>
print(string.format("The coordinates are %d,%d", x, y))
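A Python rendering of the same "put it back together by hand" idea might look like this. It is a sketch only: the sample `buffer` contents are invented, and integer division stands in for the `y/2` above.

```python
import re

buffer = "The coordinates are 10,20"

m = re.search(r"The coordinates are ([0-9]+),([0-9]+)", buffer)
if m:
    # Captures come back as strings, so convert before doing arithmetic.
    x = int(m.group(1)) * 2
    y = int(m.group(2)) // 2
    rebuilt = "The coordinates are %d,%d" % (x, y)
    print(rebuilt)
```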
00:24
<@Vornicus>
re.sub
00:25 You're now known as TheWatcher[T-2]
00:26
<@Vornicus>
You can even specify a function that takes the match and figures out the replacement.
00:28
<@ToxicFrog>
Sounds identical to gsub.
00:29
<@Vornicus>
Yep.
00:30
<@Vornicus>
You pass 1 as the fourth parameter to get sub.
00:31
<@ToxicFrog>
?
00:31
<@ToxicFrog>
Oh. "replace the first match"
00:31
<@ToxicFrog>
(string.sub in lua is "extract substring")
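Both points about `re.sub` can be seen in a few lines. In this sketch the helper `double` and the sample text are hypothetical; `count` is the fourth parameter Vornicus mentions, passed here by keyword.

```python
import re

def double(match):
    # The function receives a match object and returns the replacement text.
    return str(int(match.group(0)) * 2)

text = "1 2 3"
every = re.sub(r"[0-9]+", double, text)           # replace all matches
first = re.sub(r"[0-9]+", double, text, count=1)  # replace only the first
print(every)
print(first)
```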
00:32
<@Vornicus>
heh
00:32
<@Vornicus>
Some languages (perl?) have sub and gsub separate, done as s/// and s///g
00:33
<@ToxicFrog>
Yeah.
00:33
<@ToxicFrog>
Lua gsub is actually something of an everything box - it can do an arbitrary number of replacements, and the replacement is either a string with backrefs, a lookup table of strings with backrefs, or a function
00:47 You're now known as TheWatcher[zZzZ]
01:20
<@Reiver>
if line.startswith('Starting Position Offset ' + axis):
01:20
<@Reiver>
print>>target, line.split(':=')[0], ':= ', (float((line.split(':='))[1]) + offset)
01:20
<@Reiver>
becomes
01:21
<@Reiver>
...wait no that doesn't work
01:21 * Reiver goes back to poke again.
01:23 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has quit [Quit: This computer has gone to sleep]
01:30 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code
01:30 mode/#code [+o Thaqui] by ChanServ
01:32
<@McMartin>
You might want to check to ensure that len(line.split(':=')) is in fact two.
01:34
<@Reiver>
McM: Actually, I was halfway through trying to convert my splits to regexes.
01:34
<@Reiver>
Which look to be infinitely more useful, if I can get the things to work~
01:34
<@Vornicus>
line.split(':=',1)
01:35
<@McMartin>
Vorn: Doesn't catch lines with no := at all
01:35
<@Vornicus>
true
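Combining the two suggestions, a defensive split could look like the sketch below. The function name `split_assignment` is made up for illustration, and the sample lines are guesses at the file format.

```python
def split_assignment(line):
    """Return (name, value) for a 'name := value' line, or None."""
    parts = line.split(":=", 1)   # at most one split, as Vornicus suggests
    if len(parts) != 2:           # no ':=' at all, which is McMartin's case
        return None
    return parts[0].strip(), parts[1].strip()

print(split_assignment("Starting Position Offset Y := 12.5"))
print(split_assignment("no assignment here"))
```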
01:36
<@ToxicFrog>
Personally, I'd just try to match against 'Effect ([0-9]+) Starting Offset Y := ([0-9]+)'
01:36
<@ToxicFrog>
Replace the whitespace with ' +' or whatever as appropriate
01:36
<@ToxicFrog>
Then change the coordinate value and re-create the line with %
01:36
<@ToxicFrog>
Stuff that doesn't match, you skip
01:37
<@Reiver>
I'm mostly tripping up on syntax.
01:37
<@ToxicFrog>
That, I'm afraid I can't help you with.
01:38
<@Reiver>
Fair enough~
01:38 * Reiver dives into helpfiles to work out how one makes a regex work in general.
01:41 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has quit [Quit: This computer has gone to sleep]
01:41
<@ToxicFrog>
Oh. Regex syntax I can do.
01:41
<@ToxicFrog>
I thought you meant python syntax.
01:41
<@Reiver>
Nooo, I mean getting the python syntax to read the regex syntax.
01:41
<@ToxicFrog>
(although I don't know any peculiarities specific to python's regex impl)
01:42
<@McMartin>
If you're running into backslash problems, use raw strings
01:43
<@McMartin>
Either r"""strings 4tw""" or r"strings 4tw"
01:43
<@Vornicus>
r'this is a raw string'
01:43
<@McMartin>
However, the latter cannot end with an odd number of backslashes.
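A quick way to see what the r prefix does: it is the prefix that makes a string raw, and triple quotes on their own still process backslash escapes. The sketch below uses made-up sample strings, and the trailing-backslash workaround is one option among several.

```python
normal = "a\nb"        # backslash-n becomes a newline: 3 characters
triple = """a\nb"""    # triple quotes still process escapes: 3 characters
raw = r"a\nb"          # the r prefix keeps the backslash: 4 characters
print(len(normal), len(triple), len(raw))

# A raw string cannot end in an odd number of backslashes,
# so one workaround is to append the backslash separately.
trailing = r"C:\temp" + "\\"
print(trailing)
```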
01:45
<@Reiver>
...Oh, no no, much more basic than that.
01:45
<@Vornicus>
Show me what you've got
01:45
<@Vornicus>
I'll tell you how to fix it.
01:46
<@Vornicus>
TF and I are good at regex.
01:47
<@Reiver>
if line.match('Effect [0-9]+ Starting Offset Y := (Y)+'):
01:48
<@Reiver>
Yeah, I know I'm botching it~
01:53
<@ToxicFrog>
Ok. First of all, recall that [0-9]+ is "one or more digits"
01:53
<@ToxicFrog>
So you use that where you're looking for an integer.
01:54
<@Reiver>
ok
01:54
<@ToxicFrog>
"Y", in contrast, matches the literal character 'Y'
01:54
<@Reiver>
Er, yeah, I was planning to ask about that, as the Y is a placeholder.
01:54
<@Reiver>
How does one insert variables into a regex?
01:54
<@ToxicFrog>
And () should completely enclose the pattern you want to match; move the + inside the ()
01:55
<@ToxicFrog>
...same way you insert variables into any string.
01:55
<@ToxicFrog>
Regexes in python aren't a special type, they're just strings.
01:55
<@ToxicFrog>
So something like, say:
01:55
<@ToxicFrog>
'Effect [0-9]+ Starting Offset Y := (%d)' % y_value
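Since the pattern is just a string, a variable can be dropped in with `%` like anywhere else. This sketch assumes the variable is the axis letter (a hypothetical `axis`, borrowed from the earlier split code) and uses `re.escape` as a precaution in case the inserted text contains regex metacharacters.

```python
import re

axis = "Y"  # hypothetical variable holding the axis name

# re.escape guards against characters with special regex meaning.
pattern = r"Effect [0-9]+ Starting Offset %s := ([0-9]+)" % re.escape(axis)

m = re.search(pattern, "Effect 3 Starting Offset Y := 42")
value = m.group(1)
other_axis = re.search(pattern, "Effect 3 Starting Offset X := 42")
print(value)
```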
01:56
<@Reiver>
...Clearly I need to learn more about Python strings.
01:57
<@ToxicFrog>
We discussed this earlier, I thought?
01:57
<@ToxicFrog>
Yes, we did
01:58
<@ToxicFrog>
3h15m ago - printf and the string-% operator
02:01
<@Reiver>
...Oh, is that what Vorn was talking about~
02:01 * Reiver thought that was a formatting thingy. Which, well, it is but he misread it as specifically a 'line things up' formatting thingy.
02:05
<@Reiver>
Actually, that's wrong anyhow.
02:05
<@Reiver>
What I need is a [0-9]+ that allows decimals.
02:06
<@ToxicFrog>
[.0-9]+
02:06
<@Reiver>
...That was easier than expected
02:06
<@Reiver>
Is there a specific order to such things?
02:06
<@ToxicFrog>
Although that will also match, say, "123...456"
02:07
<@ToxicFrog>
So you might try instead: [0-9]*\.?[0-9]+
02:07
<@ToxicFrog>
(you need the \escape because '.' has a specific meaning in regexes - "any one character")
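The difference between the two decimal patterns shows up when they are run against a few sample strings. The samples here are invented, and `fullmatch` is used so that each pattern has to account for the entire string.

```python
import re

loose = re.compile(r"[.0-9]+")             # any mix of dots and digits
strict = re.compile(r"[0-9]*\.?[0-9]+")    # at most one dot, digits after it

samples = ["42", "3.14", ".5", ".", "123...456"]
results = {
    text: (bool(loose.fullmatch(text)), bool(strict.fullmatch(text)))
    for text in samples
}
for text, (loose_ok, strict_ok) in results.items():
    print(text, loose_ok, strict_ok)
```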
02:07
<@Reiver>
Hrm
02:07
<@ToxicFrog>
No. Ordering is unimportant.
02:07
<@Reiver>
Well, I'm not worried about the special case being an issue.
02:07
<@ToxicFrog>
Well, apart from -
02:07
<@Reiver>
If that was in the file, it would cause a loading error~
02:08
<@ToxicFrog>
[0-9] and [9-0] aren't equivalent, but [1234] and [4321] are.
02:08
<@ToxicFrog>
(at least, I don't think they're equivalent...)
02:08
<@Reiver>
[9-0] would be doing...?
02:08
<@Vornicus>
9-0 doesn't work
02:09
<@Vornicus>
0-9 is "everything with an ascii value between ord('0') and ord('9'), inclusive"
02:09
<@Vornicus>
9-0 is backwards
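In Python's `re` module a backwards range is rejected when the pattern is compiled, while the order of plain characters in a set makes no difference. A small sketch to confirm both:

```python
import re

try:
    re.compile(r"[9-0]")
    reversed_ok = True
except re.error as exc:
    reversed_ok = False
    print("rejected:", exc)

# Order inside a set otherwise does not matter.
forward = re.findall(r"[1234]", "4a3b2c1")
backward = re.findall(r"[4321]", "4a3b2c1")
same = forward == backward
```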
02:09
<@ToxicFrog>
There's probably a perl option somewhere that interprets 9-0 as "everything except 12345678"~
02:09
<@Vornicus>
heh
02:31
<@Reiver>
Hey, uh, Vorn?
02:31
<@Reiver>
What syntax should I be using to try and get match to work?
02:31
<@Vornicus>
"work"?
02:31
<@Reiver>
line.match isn't the correct one. I'm pretty sure I should be using re.match, but then I don't quite know where to insert line, and uh
02:32
<@Vornicus>
re.match(pattern, line)
02:32
<@Vornicus>
and usually you want search
02:34
<@Reiver>
Match is more strict, though?
02:35
<@Vornicus>
Match will only match at the beginning of the line; however I prefer starting my regex with ^ instead of using match, because it's more obvious.
02:36
<@Reiver>
?
02:36
<@Vornicus>
^ at the beginning of a regex means "this is the start of a line"
02:37
<@Vornicus>
$ at the end of a regex means "this is the end of a line"
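The difference between `re.match`, `re.search`, and the `^` and `$` anchors can be seen on a line with leading whitespace. The sample line is invented for this sketch.

```python
import re

line = "  Effect 3 Starting Offset Y := 42"

at_start = re.match(r"Effect", line)     # None: the line starts with spaces
anywhere = re.search(r"Effect", line)    # found further into the line
anchored = re.search(r"^Effect", line)   # None, behaves like match here

# ^ and $ together force the pattern to cover the whole line.
whole = re.search(r"^\s*Effect [0-9]+ Starting Offset Y := [0-9]+$", line)
```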
02:37
<@Reiver>
aha
02:37
<@Reiver>
so
02:38
<@Reiver>
Now to figure out how to use the regex to edit a line once I've found the data inside it.
02:38 * Reiver goes to reread the backscroll~
02:40
<@Reiver>
So ([0-9]+) lets you return the number specified. Are you able to then edit that number without any fancy commands?
02:41
<@ToxicFrog>
Well, once it's returned it's just a variable.
02:41
<@ToxicFrog>
If you mean "can you mutate it inside the string", see re.sub, discussed above
02:41
<@ToxicFrog>
Or putting the string back together with the % operator, also discussed above
02:43
<@Reiver>
I'm mostly looking at the putting the string back together one.
02:44
<@Reiver>
Once "if re.match('Effect [0-9]+ Starting Offset Y := ([.0-9]+)', line):" has found the line and returned me the variable, I then want to add/subtract to it, then recreate the entire line as it was.
02:45
<@Reiver>
Previously, I was doing this with .split, because I'd already split the string into two in order to modify the number.
02:45
<@ToxicFrog>
Well, for that, you'll need the effect # as well
02:45
<@ToxicFrog>
So you need to wrap that in (), too
02:45
<@Reiver>
Aaah
02:45
<@Reiver>
I was wondering how to get the regex'd line back out again. Answer is: Don't bother, just recreate it?
02:47
<@ToxicFrog>
Yes. Hence "putting it back together by hand"
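Putting the pieces together, the capture-and-recreate approach might be sketched like this. The line format is taken from the regex quoted in the conversation, the real file may differ, and `shift_line` is a hypothetical helper. Passing non-matching lines through unchanged is an assumption about what "skip" should mean.

```python
import re

PATTERN = re.compile(r"^Effect ([0-9]+) Starting Offset Y := ([.0-9]+)$")

def shift_line(line, offset):
    """Return the line with its Y offset shifted, or unchanged if no match."""
    m = PATTERN.search(line)
    if m is None:
        return line  # assumption: non-matching lines pass through untouched
    effect = int(m.group(1))                 # the effect number, kept as-is
    value = float(m.group(2)) + offset       # the number being adjusted
    return "Effect %d Starting Offset Y := %s" % (effect, value)

print(shift_line("Effect 3 Starting Offset Y := 12.5", 2.0))
```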
02:47
<@ToxicFrog>
For more complicated stuff - for example, stuff with lots of variable segments where it's tedious to capture them all - you use re.sub
02:49
<@Vornicus>
note that with re.sub you must match /only/ that which you wish to replace.
02:50
<@ToxicFrog>
Yes. But you can pass that to a function, which can do more involved stuff to it.
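One way to match more than you want to change is to capture the part to keep and have the replacement function emit it again. A sketch, with a hypothetical `add_offset` helper and the line format assumed from the earlier regex:

```python
import re

def add_offset(line, offset):
    def bump(m):
        # group(1) is the text to keep, group(2) the number to change.
        return m.group(1) + str(float(m.group(2)) + offset)
    return re.sub(r"(Starting Offset Y := )([.0-9]+)", bump, line)

print(add_offset("Effect 3 Starting Offset Y := 12.5", 2.0))
```

Because the effect number sits outside the matched span, it never needs capturing; only the prefix that anchors the match is re-emitted.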
02:51
<@Reiver>
Aha. So putting back together by hand it is then~
02:51 * Reiver was just wonderin' if there wasn't a more, I dunno, 'elegant' way, so to speak.
03:52
<@Vornicus>
Note that you can create a pattern that knows what to look for before it.
03:52
<@Vornicus>
But it's limited - lookbehind must have a fixed length, though I'm not sure why.
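A lookbehind lets the pattern require a prefix without making it part of the match. The sketch below shows a fixed-length one working and a variable-length one being rejected by Python's `re` module; the sample line is invented.

```python
import re

line = "Effect 3 Starting Offset Y := 42"

# Fixed-length lookbehind: the prefix must be present but is not replaced.
result = re.sub(r"(?<=Starting Offset Y := )[0-9]+", "50", line)
print(result)

# A variable-length lookbehind fails at compile time.
try:
    re.compile(r"(?<=Effect [0-9]+ )Starting")
    variable_ok = True
except re.error:
    variable_ok = False
```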
03:57
<@ToxicFrog>
There's probably a formal proof somewhere that is, as it is said, as twisted and impenetrable as a granite octopus.
05:23 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code
05:23 mode/#code [+o Thaqui] by ChanServ
06:32 Pi [~sysop@Nightstar-24414.hsd1.wa.comcast.net] has quit [Ping Timeout]
07:23 GeekSoldier|bed is now known as GeekSoldier
08:29 Pi [~sysop@Nightstar-24414.hsd1.wa.comcast.net] has joined #code
08:29 mode/#code [+o Pi] by ChanServ
09:12 You're now known as TheWatcher
09:51 AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has quit [Connection reset by peer]
09:54 AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has joined #Code
09:54 mode/#code [+o AnnoDomini] by ChanServ
09:58 AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has quit [Ping Timeout]
09:58 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has left #code [Leaving]
10:10 AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has joined #Code
10:10 mode/#code [+o AnnoDomini] by ChanServ
12:10
<@Reiver>
ho-kay, I sleep now. I shall make sense of dis in da mornink
12:11
< GeekSoldier>
heh. sleep well, and don't let visions of wayward regexes haunt your dreams.
12:11
<@Vornicus>
Sleep is your friend.
12:12
<@Reiver>
Mostly I am getting mixed up in "Okay, I have my regex there with the parentheses around it, but... wait a minute actually what I really need is this written out /again/ the next line down, hrm, is that right I have no id-- bugger it, SLEP"
12:13 Reiver is now known as ReivZzz
12:14
<@Vornicus>
Heh. Been there, done that.
12:26 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code
12:26 mode/#code [+o Thaqui] by ChanServ
12:34 ReivZzz [~reaverta@118.90.79.ns-4997] has quit [Ping Timeout]
12:42 ReivZzz [~reaverta@Admin.Nightstar.Net] has joined #Code
12:54 Vornicus is now known as Vornicus-Latens
12:58 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has quit [Quit: This computer has gone to sleep]
14:43 GeekSoldier [~Rob@Nightstar-4783.pools.arcor-ip.net] has quit [Ping Timeout]
14:46 GeekSoldier [~Rob@Nightstar-5218.pools.arcor-ip.net] has joined #code
17:02 * ToxicFrog goes to track down Chris Roberts
17:03
<@ToxicFrog>
Hmm. Now chairman of Ascendant Pictures, can't be contacted directly.
17:03 * ToxicFrog grabs Claw Marks and rummages through it for other programmers
17:23 Forj [~Forj@Nightstar-10789.ue.woosh.co.nz] has joined #code
17:23 mode/#code [+o Forj] by ChanServ
17:48 Forj [~Forj@Nightstar-10789.ue.woosh.co.nz] has quit [Quit: Gone]
18:03 * ToxicFrog proves that fuzzy min intersection and fuzzy max union obey DeMorgan's Laws
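The claim can be spot-checked numerically, though a check over sample values is not a proof. This sketch assumes the standard fuzzy complement 1 - x, which the log does not actually state, and the helper names and grades are made up.

```python
import itertools

def f_and(a, b):
    return min(a, b)    # fuzzy intersection

def f_or(a, b):
    return max(a, b)    # fuzzy union

def f_not(a):
    return 1 - a        # assumed complement, not stated in the log

grades = [0.0, 0.1, 0.25, 0.5, 0.75, 1.0]
holds = all(
    f_not(f_and(a, b)) == f_or(f_not(a), f_not(b))
    and f_not(f_or(a, b)) == f_and(f_not(a), f_not(b))
    for a, b in itertools.product(grades, repeat=2)
)
print(holds)
```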
18:41
<@ToxicFrog>
<3 unicode
18:41
<@ToxicFrog>
Actually, let's try that again
18:41
<@ToxicFrog>
♥ unicode
18:45 * AnnoDomini gets a box.
18:45
<@AnnoDomini>
But that's probably since I use Fixedsys.
18:45
< GeekSoldier>
?
18:46
<@ToxicFrog>
I use LucidaTypewriter, which has so far contained all the unicode symbols I've tried to use.
18:46
<@ToxicFrog>
Yes, GS, 0x2665 UNICODE BLACK HEART SUIT
18:48
< GeekSoldier>
I got: a-circumflex, trademark, yen
18:49
< GeekSoldier>
perhaps I should finally put unicode on here.
18:49
<@ToxicFrog>
You're probably using Latin-1 rather than UTF8, then.
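The "a-circumflex, trademark, yen" result is what the UTF-8 bytes of the heart look like when read one byte at a time. The sketch assumes the client was decoding as Windows-1252, a common superset of Latin-1; strict Latin-1 has no printable trademark sign at that byte.

```python
heart = "\u2665"                      # BLACK HEART SUIT
utf8_bytes = heart.encode("utf-8")    # three bytes: e2 99 a5

# Assumption: the client decoded as Windows-1252, which maps
# byte 0x99 to the trademark sign.
garbled = utf8_bytes.decode("cp1252")
print(utf8_bytes.hex(), garbled)
```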
18:49
< GeekSoldier>
yeah.
18:53
< GeekSoldier>
I don't think IceChat does unicode.
18:55
< GeekSoldier>
oh, it does.
18:58
< GeekSoldier>
or not.
19:13 GeekSoldier is now known as GeekSoldier|bed
19:17 * AnnoDomini changes to Courier New, on a whim, at the same time changing some colour settings.
19:30 Chalcedon [~Chalcedon@Nightstar-10789.ue.woosh.co.nz] has joined #code
19:30 mode/#code [+o Chalcedon] by ChanServ
19:59 Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code
19:59 mode/#code [+o Thaqui] by ChanServ
20:19 Pi [~sysop@Nightstar-24414.hsd1.wa.comcast.net] has quit [Quit: There is no justice. There is only me.]
21:07 AbuDhabi [AnnoDomini@Nightstar-29598.neoplus.adsl.tpnet.pl] has joined #Code
21:07 mode/#code [+o AbuDhabi] by ChanServ
21:08 AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has quit [Ping Timeout]
21:58 Chalcy [~Chalcedon@Nightstar-10789.ue.woosh.co.nz] has joined #code
21:58 mode/#code [+o Chalcy] by ChanServ
21:58 Chalcedon [~Chalcedon@Nightstar-10789.ue.woosh.co.nz] has quit [Ping Timeout]
22:10 Vornicus-Latens is now known as Vornicus
--- Log closed Mon Nov 26 00:00:58 2007