RegExpSubstitute

roguetrip
Posts: 22
Joined: March 11th, 2019, 5:42 pm

RegExpSubstitute

roguetrip » April 11th, 2019, 1:30 am

Looking for more help. Just starting to try out Regex and parse some info.

I have a measure to pull my cpu name from the registry.

The string outputs as:
Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz

I have been going through some regex tutorials and experimenting in the Atom text editor. If I search for (\d{4}[K|X]) in Atom, it matches 4790K.

But adding it to my measure as a substitution fails. What's up?
Substitute="(\d{4}[K|X])":"\1"

		[MS.CPU.ID]
		Measure			=Registry
		RegHKey			=HKEY_LOCAL_MACHINE
		RegKey			=HARDWARE\DESCRIPTION\System\CentralProcessor\0
		RegValue		=ProcessorNameString
		UpdateDivider		=-1
		RegExpSubstitute	=1
		Substitute		="(\d{4}[K|X])":"\1"
roguetrip
Posts: 22
Joined: March 11th, 2019, 5:42 pm

Re: RegExpSubstitute

roguetrip » April 11th, 2019, 1:47 am

It seems that Substitute="^.*-(\d{4}[K|X]) CPU.*$":"\1" gets it done.
Or: Substitute="^.*-?(\d{4}[A-Z]?) .*$":"\1"
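
A minimal sketch of why the first attempt looked like a failure while the anchored version works. It uses Python's re module as a stand-in for Rainmeter, on the assumption that these simple patterns behave the same in the PCRE engine Rainmeter uses. A substitution only replaces the part that matched, so swapping a match for its own capture changes nothing. The pattern has to consume the whole string for \1 to be all that is left.

```python
import re

cpu = "Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz"

# Only "4790K" is matched, and it is replaced with itself: no visible change.
unanchored = re.sub(r"(\d{4}[K|X])", r"\1", cpu)

# The whole string is matched, so only the captured group remains.
anchored = re.sub(r"^.*-(\d{4}[K|X]) CPU.*$", r"\1", cpu)

print(unanchored)
print(anchored)
```

As a side note, [K|X] is a character class, so it also accepts a literal | character. [KX] is enough.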


Could anyone with an AMD CPU share what the string looks like? I am trying to make the substitution work with both Intel and AMD CPUs.
jsmorley
Developer
Posts: 19272
Joined: April 19th, 2009, 11:02 pm
Location: Fort Hunt, Virginia, USA

Re: RegExpSubstitute

jsmorley » April 11th, 2019, 2:44 am

AMD FX(tm)-8320 Eight-Core Processor
roguetrip
Posts: 22
Joined: March 11th, 2019, 5:42 pm

Re: RegExpSubstitute

roguetrip » April 11th, 2019, 4:07 am

Thanks jsmorley



It looks like Substitute="^.*-?([A-Z]?\d{4}[A-Z]?[A-Z]?) .*$":"\1" works fine for both AMD and Intel, as long as the CPU model has 4 digits.

If I try to allow 3-digit models with Substitute="^.*-?([A-Z]?\d{3,4}[A-Z]?[A-Z]?) .*$":"\1" or Substitute="^.*-?([A-Z]?\d{3,}[A-Z]?[A-Z]?) .*$":"\1", I get 790K: the result is truncated instead of showing all 4 digits.

I haven't figured that out yet.
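
The truncation can be reproduced outside Rainmeter. The sketch below uses Python's re module (an assumption: it behaves like Rainmeter's PCRE engine for these patterns), and the helper name model is made up for the illustration.

```python
import re

cpu = "Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz"

def model(pattern, text):
    """Hypothetical helper: replace the whole match with capture group 1."""
    return re.sub(pattern, r"\1", text)

# Exactly four digits: the capture has to start at the "4".
four_digits = model(r"^.*-?([A-Z]?\d{4}[A-Z]?[A-Z]?) .*$", cpu)

# Three or more digits: the leading greedy .* is free to take the "-" and
# the first digit, because three digits are enough for the capture.
three_plus = model(r"^.*-?([A-Z]?\d{3,}[A-Z]?[A-Z]?) .*$", cpu)

print(four_digits)
print(three_plus)
```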
Yincognito
Posts: 652
Joined: February 27th, 2015, 2:38 pm

Re: RegExpSubstitute

Yincognito » April 13th, 2019, 6:54 pm

roguetrip wrote:
April 11th, 2019, 4:07 am
It looks like Substitute="^.*-?([A-Z]?\d{4}[A-Z]?[A-Z]?) .*$":"\1" works fine for both AMD and Intel, as long as the CPU model has 4 digits.

If I try to allow 3-digit models with Substitute="^.*-?([A-Z]?\d{3,4}[A-Z]?[A-Z]?) .*$":"\1" or Substitute="^.*-?([A-Z]?\d{3,}[A-Z]?[A-Z]?) .*$":"\1", I get 790K: the result is truncated instead of showing all 4 digits.

I haven't figured that out yet.
If you omit the ? after the -, both regexes will work. With the - optional, the greedy .* at the start swallows as much as it can, including the - and the first digit, and leaves only the minimum of 3 digits for the capture group. Too many optional or variable-length elements (?, {3,}, {3,4} or even *) in a regex are a bad idea, as they make the regex too volatile and leave it without "fixed" characters to use as anchors. In this case the only anchor it could use, apart from the ^ and $, was the space after the capture group, which is why the result was truncated to the last 3 digits and not the first 3, for example. The idea is to use optionals only when you actually need them, and to give the regex some fixed characters that it can anchor itself to at the beginning and end of the part you're interested in getting.
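
A quick sketch of the fix, checked with Python's re module as a stand-in for Rainmeter's regex engine (an assumption; the two samples are the strings posted earlier in the thread). With the - made mandatory, the capture can only start right after a hyphen.

```python
import re

samples = [
    "Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz",
    "AMD FX(tm)-8320 Eight-Core Processor",
]

# Same pattern as before, minus the ? after the hyphen.
pattern = r"^.*-([A-Z]?\d{3,}[A-Z]?[A-Z]?) .*$"

results = [re.sub(pattern, r"\1", s) for s in samples]
print(results)
```

For the AMD string, the hyphen in Eight-Core is tried first, but no digits follow it, so the engine backtracks to the hyphen in front of 8320.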

That being said, I would further simplify the regex to (?U)^.*-(.*\d{3,}.*) .*$, as identifying absolutely every character in the string is not critical to getting the right result. The starting -, the 3+ digits and the space at the end should be enough in most cases.

P.S. Sorry for my delayed reply, I wanted to answer this earlier, but I was on mobile (not a reliable regex test environment, LOL) and forgot to come back later on.
roguetrip
Posts: 22
Joined: March 11th, 2019, 5:42 pm

Re: RegExpSubstitute

roguetrip » April 13th, 2019, 8:56 pm

Thanks for the reply.


I had actually got it working later on, but by using a lot of optional ? quantifiers.

Using Substitute="^.*?-?([A-Z]?\d{3,}[A-Z]?[A-Z]?).*?$":"\1" worked well for both AMD and Intel model numbers.


Your simplified (?U)^.*-(.*\d{3,}.*) .*$ regex looks like it works well for anything with a -. It looks (to me at least) like it would not work with a CPU like AMD(R) Ryzen 1800X Eight-Core Processor, as there is no leading - before the capture.

Testing your regex in Atom shows that it won't pick up the AMD string.



Edit: Sure enough, your regex does rely on the -. I created a String measure containing...
AMD(R) Ryzen 1800X Eight-Core Processor and it doesn't work :(

Not to boast, but my regex does work around it and outputs 1800X

Trying to figure out how to mod yours now... It looks better than mine :)
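
A sketch of the comparison in Python's re module. Two assumptions apply: re stands in for Rainmeter's regex engine, and since re has no (?U) flag, the second pattern is a hand-written equivalent in which every quantifier is made lazy.

```python
import re

ryzen = "AMD(R) Ryzen 1800X Eight-Core Processor"
intel = "Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz"

# Pattern with the optional hyphen.
optional_hyphen = r"^.*?-?([A-Z]?\d{3,}[A-Z]?[A-Z]?).*?$"

# Hand-written lazy version of (?U)^.*-(.*\d{3,}.*) .*$ (hyphen required).
required_hyphen = r"^.*?-(.*?\d{3,}?.*?) .*?$"

ryzen_optional = re.sub(optional_hyphen, r"\1", ryzen)
ryzen_required = re.sub(required_hyphen, r"\1", ryzen)
intel_required = re.sub(required_hyphen, r"\1", intel)

print(ryzen_optional)
print(ryzen_required)  # no match, so the string comes back unchanged
print(intel_required)
```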
Yincognito
Posts: 652
Joined: February 27th, 2015, 2:38 pm

Re: RegExpSubstitute

Yincognito » April 13th, 2019, 10:09 pm

roguetrip wrote:
April 13th, 2019, 8:56 pm
Thanks for the reply.


I had actually got it working later on, but by using a lot of optional ? quantifiers.

Using Substitute="^.*?-?([A-Z]?\d{3,}[A-Z]?[A-Z]?).*?$":"\1" worked well for both AMD and Intel model numbers.


Your simplified (?U)^.*-(.*\d{3,}.*) .*$ regex looks like it works well for anything with a -. It looks (to me at least) like it would not work with a CPU like AMD(R) Ryzen 1800X Eight-Core Processor, as there is no leading - before the capture.

Testing your regex in Atom shows that it won't pick up the AMD string.

Edit: Sure enough, your regex does rely on the -. I created a String measure containing...
AMD(R) Ryzen 1800X Eight-Core Processor and it doesn't work :(

Not to boast, but my regex does work around it and outputs 1800X

Trying to figure out how to mod yours now... It looks better than mine :)
Well, I worked with the samples you and jsmorley provided, and they did have a leading -. :D Other than that, you did more or less the same as I did, because .*? actually makes the .* lazy (similar to, but slightly different from, the (?U) flag that I used), so it matches as few characters as possible. In practice, the ? does not mean "optional" in this case; it has the role of "inverting" the preceding greedy *.
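
A tiny illustration of greedy versus lazy matching, written for Python's re module. The test string is made up for the example and has two hyphens, so the difference is visible.

```python
import re

text = "i7-4790K-extra"  # hypothetical string with two hyphens

# Greedy: .* grabs as much as possible, so the match ends at the LAST hyphen.
greedy = re.match(r"^(.*)-", text).group(1)

# Lazy: .*? grabs as little as possible, so the match ends at the FIRST hyphen.
lazy = re.match(r"^(.*?)-", text).group(1)

print(greedy)
print(lazy)
```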

That being said, now that I think about it, for the following hypothetical cases your regex will return 567 and 1800XY for the last 2 rows (use the multiline flag when testing online, so that each row is treated as a separate string). This happens because, on the 3rd row, the lazy .*? makes 567 the first match and, on the 4th row, more than 2 letters follow the digits.

Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
AMD FX(tm)-8320 Eight-Core Processor
AMD(R) Ryzen567 1800X Eight-Core Processor
AMD(R) Ryzen 1800XYZ 567Eight-Core Processor

On the other hand, something like ^.*[- ](\d{3,}[A-Z]*) .*$ will match things correctly. You know, since you thought modding my regex would look better... ;-)
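
The four rows can be checked one string at a time. This sketch uses Python's re module, on the assumption that it matches Rainmeter's behaviour for these patterns, and compares the lazy pattern with the [- ] version.

```python
import re

rows = [
    "Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz",
    "AMD FX(tm)-8320 Eight-Core Processor",
    "AMD(R) Ryzen567 1800X Eight-Core Processor",
    "AMD(R) Ryzen 1800XYZ 567Eight-Core Processor",
]

lazy_pattern = r"^.*?-?([A-Z]?\d{3,}[A-Z]?[A-Z]?).*?$"
anchored_pattern = r"^.*[- ](\d{3,}[A-Z]*) .*$"

# Each row is processed as its own string, so no multiline flag is needed.
lazy_results = [re.sub(lazy_pattern, r"\1", row) for row in rows]
anchored_results = [re.sub(anchored_pattern, r"\1", row) for row in rows]

for row, a, b in zip(rows, lazy_results, anchored_results):
    print(f"{row!r}: lazy={a!r} anchored={b!r}")
```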
roguetrip
Posts: 22
Joined: March 11th, 2019, 5:42 pm

Re: RegExpSubstitute

roguetrip » April 13th, 2019, 10:31 pm

That looks nice! Will test :thumbup:

^.*[- ](\D*\d{3,}\D*) .*$ would work the same, I'm thinking...


I had just grabbed a few processor samples from Google as my test set :)

Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
AMD(R) Ryzen 1800X Eight-Core Processor

Intel(R) Core(TM) i7-950KF CPU @ 4.00GHz

E2100
N270

AMD FX(tm)-8320 Eight-Core Processor
Yincognito
Posts: 652
Joined: February 27th, 2015, 2:38 pm

Re: RegExpSubstitute

Yincognito » April 13th, 2019, 10:38 pm

roguetrip wrote:
April 13th, 2019, 10:31 pm
That looks nice! Will test :thumbup:

^.*[- ](\D*\d{3,}\D*) .*$ would work the same, I'm thinking...
I think so, yeah. Just settle on the regex that works for all the samples you can grab online. ;-)
roguetrip
Posts: 22
Joined: March 11th, 2019, 5:42 pm

Re: RegExpSubstitute

roguetrip » April 13th, 2019, 10:48 pm

Your regex works well unless the model is something like N270, but changing it to ^.*[- ]([A-Z]*\d{3,}[A-Z]*) .*$ fixes that.


Adding the \D* brings back more than wanted for some reason. Weird, since I had it working well enough by changing my regex to "^.*?-?(\D?\d{3,}\D?\D?) .*?$":"\1"," ":"" before your post earlier.



Example: the regex ^.*[- ](\D*\d{3,}\D*) .*$ returns 8320 Eight-Core for the string AMD FX(tm)-8320 Eight-Core Processor


Edit: adding \w* instead seems to work. Huh...
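
A sketch comparing the three variants in Python's re module (assumed to behave like Rainmeter's engine here). The Atom string is an assumed sample of a name containing N270, not one posted in this thread. The likely reason for the extra text is that \D matches any non-digit, including spaces and hyphens, so the trailing \D* runs on to the last space it can still leave for the rest of the pattern. [A-Z] and \w stop at the first space.

```python
import re

amd = "AMD FX(tm)-8320 Eight-Core Processor"
atom = "Intel(R) Atom(TM) CPU N270 @ 1.60GHz"  # assumed sample string

non_digit = r"^.*[- ](\D*\d{3,}\D*) .*$"
letters = r"^.*[- ]([A-Z]*\d{3,}[A-Z]*) .*$"
word = r"^.*[- ](\w*\d{3,}\w*) .*$"

amd_non_digit = re.sub(non_digit, r"\1", amd)   # \D* runs past the space
amd_letters = re.sub(letters, r"\1", amd)
amd_word = re.sub(word, r"\1", amd)
atom_letters = re.sub(letters, r"\1", atom)
atom_word = re.sub(word, r"\1", atom)

print(amd_non_digit, amd_letters, amd_word, atom_letters, atom_word, sep=" | ")
```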
Last edited by roguetrip on April 13th, 2019, 11:01 pm, edited 1 time in total.