--- Log opened Sun Nov 25 00:00:52 2007 |
00:01 | <@ToxicFrog> | Yeah. Regexes are what you need here. |
00:02 | <@ToxicFrog> | 'Effect [0-9]+ Starting Offset Y := ([0-9]+)' |
00:05 | <@Reiver> | Brackets for doing what, out of interest? |
00:05 | <@ToxicFrog> | Captures. |
00:05 | <@Reiver> | Captures? |
00:05 | <@ToxicFrog> | Sections of the regex you're particularly interested in, and want to manipulate separately. |
00:05 | <@ToxicFrog> | Using Lua here, but I believe python does it similarly, if I did something like: |
00:05 | <@Reiver> | ...So I run the regex, and it automatically splits out the final number? |
00:06 | <@ToxicFrog> | foo = "the number is 64" |
00:06 | <@Vornicus> | Square brackets are for "choose a character" |
00:06 | <@ToxicFrog> | foo:match("number is [0-9]+") => "number is 64" |
00:06 | <@ToxicFrog> | foo:match("number is ([0-9]+)") => "64" |
00:07 | <@Vornicus> | Parentheses are for "this is a pile of things that go together, and I'll capture and give you the result of that section of regex." |
00:07 | <@Vornicus> | Curly braces are for "the previous thing should show up a certain number of times" |
00:07 | <@ToxicFrog> | So typically, you have a regex that matches the entire line/block/etc, and then you enclose individual pieces in (), so you get them separately |
00:07 | <@ToxicFrog> | And then you do something with them. |
00:08 | <@Vornicus> | Note that parentheses aren't /just/ for captures |
00:08 | <@ToxicFrog> | Not everything supports grouping, though. |
00:08 | <@ToxicFrog> | Whereas everything I'm aware of does support captures. |
00:09 | <@ToxicFrog> | But, yes. In some regex implementations, you can write, say, (abc)? to mean 'either abc or nothing' |
00:10 | <@ToxicFrog> | Or (abc)+ to mean 'one or more repetitions of abc' |
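The capture examples above are in Lua; a rough Python equivalent might look like the sketch below. It assumes the standard `re` module and reuses the `foo` string from the discussion; the variable names are only illustrative.

```python
import re

foo = "the number is 64"

# Without parentheses there is only the whole match, available as group(0).
whole = re.search(r"number is [0-9]+", foo)
whole_text = whole.group(0)        # "number is 64"

# With parentheses, group(1) holds just the captured digits.
captured = re.search(r"number is ([0-9]+)", foo)
number_text = captured.group(1)    # "64"

# Parentheses also group: (abc)+ means one or more repetitions of "abc".
repeated = re.fullmatch(r"(abc)+", "abcabc")
```

Note that the captured value is still a string; it has to be passed through `int()` or `float()` before doing arithmetic on it.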
00:17 | | GeekSoldier is now known as GeekSoldier|bed |
00:18 | <@Reiver> | Aha. |
00:18 | <@Reiver> | Hmn. |
00:18 | <@Reiver> | So if I have regexed like that, I then want to... hmn |
00:19 | <@Reiver> | How would I take that captured number, modify it, then bolt it /back onto/ the line? |
00:19 | <@ToxicFrog> | Some languages have functions that do that in one step; lua has string.gsub, and since lua's string library is largely yoinked from python, python probably has an equivalent |
00:20 | <@ToxicFrog> | You can also just put the string back together by hand: |
00:20 | <@ToxicFrog> | x,y = buffer:match("The coordinates are ([0-9]+),([0-9]+)") |
00:20 | <@ToxicFrog> | x = x*2; y = y/2 |
00:21 | <@ToxicFrog> | printf("The coordinates are %d,%d", x, y) |
00:24 | <@Vornicus> | re.sub |
00:25 | | You're now known as TheWatcher[T-2] |
00:26 | <@Vornicus> | You can even specify a function that takes the match and figures out the replacement. |
00:28 | <@ToxicFrog> | Sounds identical to gsub. |
00:29 | <@Vornicus> | Yep. |
00:30 | <@Vornicus> | You pass 1 as the fourth parameter to get sub. |
00:31 | <@ToxicFrog> | ? |
00:31 | <@ToxicFrog> | Oh. "replace the first match" |
00:31 | <@ToxicFrog> | (string.sub in lua is "extract substring") |
00:32 | <@Vornicus> | heh |
00:32 | <@Vornicus> | Some languages (perl?) have sub and gsub separate but doing s/// and s///g |
00:33 | <@ToxicFrog> | Yeah. |
00:33 | <@ToxicFrog> | Lua gsub is actually something of an everything box - it can do an arbitrary number of replacements, and the replacement is either a string with backrefs, a lookup table of strings with backrefs, or a function |
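As a hedged illustration of the `re.sub` behaviour described above (replacement function, backreferences, and a count limiting the number of replacements), here is a small Python sketch. The sample text and the `double` helper are made up for the example.

```python
import re

text = "x=1 y=2 z=3"

def double(match):
    # match.group(0) is the matched run of digits; return the replacement text.
    return str(int(match.group(0)) * 2)

# Replace every match, using a function to compute each replacement.
all_doubled = re.sub(r"[0-9]+", double, text)            # "x=2 y=4 z=6"

# count=1 (the fourth parameter) replaces only the first match.
first_only = re.sub(r"[0-9]+", double, text, count=1)    # "x=2 y=2 z=3"

# The replacement can also be a string with backreferences to the captures.
swapped = re.sub(r"([a-z])=([0-9])", r"\2=\1", text)     # "1=x 2=y 3=z"
```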
00:47 | | You're now known as TheWatcher[zZzZ] |
01:20 | <@Reiver> | if line.startswith('Starting Position Offset ' + axis): |
01:20 | <@Reiver> | print>>target, line.split(':=')[0], ':= ', (float((line.split(':='))[1]) + offset) |
01:20 | <@Reiver> | becomes |
01:21 | <@Reiver> | ...wait no that doesn't work |
01:21 | | * Reiver goes back to poke again. |
01:23 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has quit [Quit: This computer has gone to sleep] |
01:30 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code |
01:30 | | mode/#code [+o Thaqui] by ChanServ |
01:32 | <@McMartin> | You might want to check to ensure that len(line.split(':=')) is in fact two. |
01:34 | <@Reiver> | McM: Actually, I was halfway through trying to convert my splits to regexes. |
01:34 | <@Reiver> | Which look to be infinitely more useful, if I can get the things to work~ |
01:34 | <@Vornicus> | line.split(':=',1) |
01:35 | <@McMartin> | Vorn: Doesn't catch lines with no := at all |
01:35 | <@Vornicus> | true |
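The two suggestions above (limit the split to one `:=`, and check that the split really produced two pieces) can be combined. A minimal sketch, where `split_assignment` is a hypothetical helper name and the sample lines are invented:

```python
def split_assignment(line):
    """Split 'name := value' once; return None when there is no ':=' at all."""
    parts = line.split(":=", 1)   # at most one split, so at most two pieces
    if len(parts) != 2:           # no ':=' in the line
        return None
    name, value = parts
    return name.strip(), value.strip()


ok = split_assignment("Starting Position Offset Y := 12.5")
missing = split_assignment("this line has no assignment")
nested = split_assignment("a := b := c")
```

Here `ok` is `("Starting Position Offset Y", "12.5")`, `missing` is `None`, and `nested` keeps everything after the first `:=` together as `("a", "b := c")`.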
01:36 | <@ToxicFrog> | Personally, I'd just try to match against 'Effect ([0-9]+) Starting Offset Y := ([0-9]+)' |
01:36 | <@ToxicFrog> | Replace the whitespace with ' +' or whatever as appropriate |
01:36 | <@ToxicFrog> | Then change the coordinate value and re-create the line with % |
01:36 | <@ToxicFrog> | Stuff that doesn't match, you skip |
01:37 | <@Reiver> | I'm mostly tripping up on syntax. |
01:37 | <@ToxicFrog> | That, I'm afraid I can't help you with. |
01:38 | <@Reiver> | Fair enough~ |
01:38 | | * Reiver dives into helpfiles to work out how one makes a regex work in general. |
01:41 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has quit [Quit: This computer has gone to sleep] |
01:41 | <@ToxicFrog> | Oh. Regex syntax I can do. |
01:41 | <@ToxicFrog> | I thought you meant python syntax. |
01:41 | <@Reiver> | Nooo, I mean getting the python syntax to read the regex syntax. |
01:41 | <@ToxicFrog> | (although I don't know any peculiarities specific to python's regex impl) |
01:42 | <@McMartin> | If you're running into backslash problems, use raw strings |
01:43 | <@McMartin> | Either """strings 4tw""" or r"strings 4tw" |
01:43 | <@Vornicus> | r'this is a raw string' |
01:43 | <@McMartin> | However, the latter cannot end with an odd number of backslashes. |
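A short sketch of the raw-string advice, with one caveat worth verifying in your own interpreter: in Python, a triple-quoted string is not raw by itself. Only the `r` prefix switches off backslash escapes, and it can be combined with either quoting style.

```python
import re

plain = "\\d+"        # ordinary string: the backslash has to be doubled
raw = r"\d+"          # raw string: the backslash is kept as typed
triple = """\\d+"""   # triple-quoted strings still process escapes

same_pattern = (plain == raw == triple)

# "\t" is one tab character in a triple-quoted string, two characters in a raw one.
tab_len = len("""\t""")
raw_len = len(r"\t")

digits = re.search(raw, "abc 123").group(0)
```

The raw form is usually the least error-prone choice for regex patterns, since regex escapes such as `\d` or `\.` then need no doubling.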
01:45 | <@Reiver> | ...Oh, no no, much more basic than that. |
01:45 | <@Vornicus> | Show me what you've got |
01:45 | <@Vornicus> | I'll tell you how to fix it. |
01:46 | <@Vornicus> | TF and I are good at regex. |
01:47 | <@Reiver> | if line.match('Effect [0-9]+ Starting Offset Y := (Y)+'): |
01:48 | <@Reiver> | Yeah, I know I'm botching it~ |
01:53 | <@ToxicFrog> | Ok. First of all, recall that [0-9]+ is "one or more digits" |
01:53 | <@ToxicFrog> | So you use that where you're looking for an integer. |
01:54 | <@Reiver> | ok |
01:54 | <@ToxicFrog> | "Y", in contrast, matches the literal character 'Y' |
01:54 | <@Reiver> | Er, yeah, I was planning to ask about that, as the Y is a placeholder. |
01:54 | <@Reiver> | How does one insert variables into a regex? |
01:54 | <@ToxicFrog> | And () should completely enclose the pattern you want to match; move the + inside the () |
01:55 | <@ToxicFrog> | ...same way you insert variables into any string. |
01:55 | <@ToxicFrog> | Regexes in python aren't a special type, they're just strings. |
01:55 | <@ToxicFrog> | So something like, say: |
01:55 | <@ToxicFrog> | 'Effect [0-9]+ Starting Offset Y := (%d)' % y_value |
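Since a pattern passed to `re` is an ordinary string, a variable can be dropped in with the `%` operator as described. The sketch below assumes a hypothetical `axis` variable and sample line; `re.escape` is added as a precaution so that any regex metacharacters in the variable are treated literally.

```python
import re

axis = "Y"  # hypothetical variable holding the axis name

# Build the pattern like any other string; re.escape guards special characters.
pattern = r"Effect [0-9]+ Starting Offset %s := ([0-9]+)" % re.escape(axis)

m = re.search(pattern, "Effect 3 Starting Offset Y := 120")
value_text = m.group(1)   # the captured digits, still a string
```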
01:56 | <@Reiver> | ...Clearly I need to learn more about Python strings. |
01:57 | <@ToxicFrog> | We discussed this earlier, I thought? |
01:57 | <@ToxicFrog> | Yes, we did |
01:58 | <@ToxicFrog> | 3h15m ago - printf and the string-% operator |
02:01 | <@Reiver> | ...Oh, is that what Vorn was talking about~ |
02:01 | | * Reiver thought that was a formatting thingy. Which, well, it is but he misread it as specifically a 'line things up' formatting thingy. |
02:05 | <@Reiver> | Actually, that's wrong anyhow. |
02:05 | <@Reiver> | What I need is a [0-9]+ that allows decimals. |
02:06 | <@ToxicFrog> | [.0-9]+ |
02:06 | <@Reiver> | ...That was easier than expected |
02:06 | <@Reiver> | Is there a specific order to such things? |
02:06 | <@ToxicFrog> | Although that will also match, say, "123...456" |
02:07 | <@ToxicFrog> | So you might try instead: [0-9]*\.?[0-9]+ |
02:07 | <@ToxicFrog> | (you need the \escape because '.' has a specific meaning in regexes - "any one character") |
02:07 | <@Reiver> | Hrm |
02:07 | <@ToxicFrog> | No. Ordering is unimportant. |
02:07 | <@Reiver> | Well, I'm not worried about the special case being an issue. |
02:07 | <@ToxicFrog> | Well, apart from - |
02:07 | <@Reiver> | If that was in the file, it would cause a loading error~ |
02:08 | <@ToxicFrog> | [0-9] and [9-0] aren't equivalent, but [1234] and [4321] are. |
02:08 | <@ToxicFrog> | (at least, I don't think they're equivalent...) |
02:08 | <@Reiver> | [9-0] would be doing...? |
02:08 | <@Vornicus> | 9-0 doesn't work |
02:09 | <@Vornicus> | 0-9 is "everything with an ascii value between ord('0') and ord('9'), inclusive" |
02:09 | <@Vornicus> | 9-0 is backwards |
02:09 | <@ToxicFrog> | There's probably a perl option somewhere that interprets 9-0 as "everything except 12345678"~ |
02:09 | <@Vornicus> | heh |
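To compare the two decimal patterns from the discussion, here is a small sketch using Python's `re` module; the sample strings are invented. It also checks the reversed range: in Python, `[9-0]` is rejected with `re.error` rather than silently matching nothing.

```python
import re

loose = re.compile(r"[.0-9]+")            # any mix of digits and dots
strict = re.compile(r"[0-9]*\.?[0-9]+")   # optional digits, optional dot, digits

samples = ["12", "12.5", ".5", "123...456"]
loose_ok = [s for s in samples if loose.fullmatch(s)]
strict_ok = [s for s in samples if strict.fullmatch(s)]

# A reversed character range does not compile in Python's re module.
try:
    re.compile(r"[9-0]")
    reversed_range_ok = True
except re.error:
    reversed_range_ok = False
```

`loose_ok` contains all four samples, while `strict_ok` drops `"123...456"`. Neither pattern handles a leading minus sign; that would need an extra `-?` at the front.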
02:31 | <@Reiver> | Hey, uh, Vorn? |
02:31 | <@Reiver> | What syntax should I be using to try and get match to work? |
02:31 | <@Vornicus> | "work"? |
02:31 | <@Reiver> | line.match isn't the correct one. I'm pretty sure I should be using re.match, but then I don't quite know where to insert line, and uh |
02:32 | <@Vornicus> | re.match(pattern, line) |
02:32 | <@Vornicus> | and usually you want search |
02:34 | <@Reiver> | Match is more strict, though? |
02:35 | <@Vornicus> | Match will only match at the beginning of the line; however I prefer starting my regex with ^ instead of using match, because it's more obvious. |
02:36 | <@Reiver> | ? |
02:36 | <@Vornicus> | ^ at the beginning of a regex means "this is the start of a line" |
02:37 | <@Vornicus> | $ at the end of a regex means "this is the end of a line" |
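The difference between `re.match` and `re.search`, and the effect of the `^` and `$` anchors, can be seen in a quick sketch. The sample line (with leading spaces) is made up for illustration.

```python
import re

line = "  Effect 2 Starting Offset Y := 40"
pattern = r"Effect [0-9]+"

# match only tries at the very start of the string, so the leading spaces defeat it.
at_start = re.match(pattern, line)

# search scans forward and finds the text wherever it occurs.
anywhere = re.search(pattern, line)

# Anchoring with ^ makes search behave like match here.
anchored_start = re.search(r"^Effect [0-9]+", line)

# $ pins the pattern to the end of the string.
anchored_end = re.search(r":= [0-9]+$", line)
```

Without the `re.MULTILINE` flag, `^` and `$` refer to the start and end of the whole string, which is the same thing as the line when the input is processed one line at a time.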
02:37 | <@Reiver> | aha |
02:37 | <@Reiver> | so |
02:38 | <@Reiver> | Now to figure out how to use the regex to edit a line once I've found the data inside it. |
02:38 | | * Reiver goes to reread the backscroll~ |
02:40 | <@Reiver> | So ([0-9]+) lets you return the number specified. Are you able to then edit that number without any fancy commands? |
02:41 | <@ToxicFrog> | Well, once it's returned it's just a variable. |
02:41 | <@ToxicFrog> | If you mean "can you mutate it inside the string", see re.sub, discussed above |
02:41 | <@ToxicFrog> | Or putting the string back together with the % operator, also discussed above |
02:43 | <@Reiver> | I'm mostly looking at the putting the string back together one. |
02:44 | <@Reiver> | Once "if re.match('Effect [0-9]+ Starting Offset Y := ([.0-9]+)', line):" has found the line and returned me the variable, I then want to add/subtract to it, then recreate the entire line as it was. |
02:45 | <@Reiver> | Previously, I was doing this with .split, because I'd already split the string into two in order to modify the number. |
02:45 | <@ToxicFrog> | Well, for that, you'll need the effect # as well |
02:45 | <@ToxicFrog> | So you need to wrap that in (), too |
02:45 | <@Reiver> | Aaah |
02:45 | <@Reiver> | I was wondering how to get the regex'd line back out again. Answer is: Don't bother, just recreate it? |
02:47 | <@ToxicFrog> | Yes. Hence "putting it back together by hand" |
02:47 | <@ToxicFrog> | For more complicated stuff - for example, stuff with lots of variable segments where it's tedious to capture them all - you use re.sub |
02:49 | <@Vornicus> | note that re.sub you must match /only/ that which you wish to replace. |
02:50 | <@ToxicFrog> | Yes. But you can pass that to a function, which can do more involved stuff to it. |
02:51 | <@Reiver> | Aha. So putting back together by hand it is then~ |
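A sketch of "putting it back together by hand": capture both the effect number and the value, adjust the value, and rebuild the line with `%`. The line layout is taken from the patterns quoted in the discussion; the real file format may differ, and `shift_line` and `LINE_RE` are hypothetical names.

```python
import re

# Two captures: the effect number and the (possibly decimal) offset value.
LINE_RE = re.compile(r"^Effect ([0-9]+) Starting Offset Y := ([0-9]*\.?[0-9]+)$")

def shift_line(line, offset):
    """Return the line with its Y offset shifted; non-matching lines pass through."""
    m = LINE_RE.match(line)
    if m is None:
        return line                        # stuff that doesn't match, you skip
    effect = int(m.group(1))
    value = float(m.group(2)) + offset     # captures are strings until converted
    return "Effect %d Starting Offset Y := %s" % (effect, value)


shifted = shift_line("Effect 3 Starting Offset Y := 12.5", 2.0)
untouched = shift_line("Some other line", 2.0)
```

Here `shifted` is `"Effect 3 Starting Offset Y := 14.5"` and `untouched` comes back unchanged. Negative values are not matched by this pattern; that is a limitation of the sketch.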
02:51 | | * Reiver was just wonderin' if there wasn't a more, I dunno, 'elegant' way, so to speak. |
03:52 | <@Vornicus> | Note that you can create a pattern that knows what to look for before it. |
03:52 | <@Vornicus> | But it's limited - lookbehind must have a fixed length, though I'm not sure why. |
03:57 | <@ToxicFrog> | There's probably a formal proof somewhere that is, as it is said, as twisted and impenetrable as a granite octopus. |
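The lookbehind idea can be sketched with `re.sub`: the pattern matches only the number to be replaced, while `(?<=...)` checks the text before it without consuming it. The sample line is invented, and the fixed-width restriction shown is specific to Python's `re` module.

```python
import re

line = "Effect 3 Starting Offset Y := 12.5"

# Only the number is matched; the lookbehind requires "Offset Y := " just before it.
shifted = re.sub(
    r"(?<=Offset Y := )[0-9]*\.?[0-9]+",
    lambda m: str(float(m.group(0)) + 2.0),
    line,
)

# A lookbehind containing [0-9]+ has no fixed width, so Python's re rejects it.
try:
    re.compile(r"(?<=Effect [0-9]+ Starting Offset Y := )[0-9]+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
```

`shifted` is `"Effect 3 Starting Offset Y := 14.5"`, and the effect number is left alone because it is never part of the match.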
05:23 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code |
05:23 | | mode/#code [+o Thaqui] by ChanServ |
06:32 | | Pi [~sysop@Nightstar-24414.hsd1.wa.comcast.net] has quit [Ping Timeout] |
07:23 | | GeekSoldier|bed is now known as GeekSoldier |
08:29 | | Pi [~sysop@Nightstar-24414.hsd1.wa.comcast.net] has joined #code |
08:29 | | mode/#code [+o Pi] by ChanServ |
09:12 | | You're now known as TheWatcher |
09:51 | | AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has quit [Connection reset by peer] |
09:54 | | AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has joined #Code |
09:54 | | mode/#code [+o AnnoDomini] by ChanServ |
09:58 | | AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has quit [Ping Timeout] |
09:58 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has left #code [Leaving] |
10:10 | | AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has joined #Code |
10:10 | | mode/#code [+o AnnoDomini] by ChanServ |
12:10 | <@Reiver> | ho-kay, I sleep now. I shall make sense of dis in da mornink |
12:11 | < GeekSoldier> | heh. sleep well, and don't let visions of wayward regexes haunt your dreams. |
12:11 | <@Vornicus> | Sleep is your friend. |
12:12 | <@Reiver> | Mostly I am getting mixed up in "Okay, I have my regex there with the parentheses around it, but... wait a minute actually what I really need is this written out /again/ the next line down, hrm, is that right I have no id-- bugger it, SLEP" |
12:13 | | Reiver is now known as ReivZzz |
12:14 | <@Vornicus> | Heh. Been there, done that. |
12:26 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code |
12:26 | | mode/#code [+o Thaqui] by ChanServ |
12:34 | | ReivZzz [~reaverta@118.90.79.ns-4997] has quit [Ping Timeout] |
12:42 | | ReivZzz [~reaverta@Admin.Nightstar.Net] has joined #Code |
12:54 | | Vornicus is now known as Vornicus-Latens |
12:58 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has quit [Quit: This computer has gone to sleep] |
14:43 | | GeekSoldier [~Rob@Nightstar-4783.pools.arcor-ip.net] has quit [Ping Timeout] |
14:46 | | GeekSoldier [~Rob@Nightstar-5218.pools.arcor-ip.net] has joined #code |
17:02 | | * ToxicFrog goes to track down Chris Roberts |
17:03 | <@ToxicFrog> | Hmm. Now chairman of Ascendant Pictures, can't be contacted directly. |
17:03 | | * ToxicFrog grabs Claw Marks and rummages through it for other programmers |
17:23 | | Forj [~Forj@Nightstar-10789.ue.woosh.co.nz] has joined #code |
17:23 | | mode/#code [+o Forj] by ChanServ |
17:48 | | Forj [~Forj@Nightstar-10789.ue.woosh.co.nz] has quit [Quit: Gone] |
18:03 | | * ToxicFrog proves that fuzzy min intersection and fuzzy max union obey DeMorgan's Laws |
18:41 | <@ToxicFrog> | <3 unicode |
18:41 | <@ToxicFrog> | Actually, let's try that again |
18:41 | <@ToxicFrog> | ♥ unicode |
18:45 | | * AnnoDomini gets a box. |
18:45 | <@AnnoDomini> | But that's probably since I use Fixedsys. |
18:45 | < GeekSoldier> | ? |
18:46 | <@ToxicFrog> | I use LucidaTypewriter, which has so far contained all the unicode symbols I've tried to use. |
18:46 | <@ToxicFrog> | Yes, GS, 0x2665 UNICODE BLACK HEART SUIT |
18:48 | < GeekSoldier> | I got: a-circumflex, trademark, yen |
18:49 | < GeekSoldier> | perhaps I should finally put unicode on here. |
18:49 | <@ToxicFrog> | You're probably using Latin-1 rather than UTF8, then. |
18:49 | < GeekSoldier> | yeah. |
18:53 | < GeekSoldier> | I don't think IceChat does unicode. |
18:55 | < GeekSoldier> | oh, it does. |
18:58 | < GeekSoldier> | or not. |
19:13 | | GeekSoldier is now known as GeekSoldier|bed |
19:17 | | * AnnoDomini changes to Courier New, on a whim, at the same time changing some colour settings. |
19:30 | | Chalcedon [~Chalcedon@Nightstar-10789.ue.woosh.co.nz] has joined #code |
19:30 | | mode/#code [+o Chalcedon] by ChanServ |
19:59 | | Thaqui [~Thaqui@Nightstar-13064.jetstream.xtra.co.nz] has joined #code |
19:59 | | mode/#code [+o Thaqui] by ChanServ |
20:19 | | Pi [~sysop@Nightstar-24414.hsd1.wa.comcast.net] has quit [Quit: There is no justice. There is only me.] |
21:07 | | AbuDhabi [AnnoDomini@Nightstar-29598.neoplus.adsl.tpnet.pl] has joined #Code |
21:07 | | mode/#code [+o AbuDhabi] by ChanServ |
21:08 | | AnnoDomini [AnnoDomini@Nightstar-29221.neoplus.adsl.tpnet.pl] has quit [Ping Timeout] |
21:58 | | Chalcy [~Chalcedon@Nightstar-10789.ue.woosh.co.nz] has joined #code |
21:58 | | mode/#code [+o Chalcy] by ChanServ |
21:58 | | Chalcedon [~Chalcedon@Nightstar-10789.ue.woosh.co.nz] has quit [Ping Timeout] |
22:10 | | Vornicus-Latens is now known as Vornicus |
--- Log closed Mon Nov 26 00:00:58 2007 |