grep: extracting data

2021-03-04

grep can be used to extract data, not just to filter lines.

The following example pulls the company names out of a JSON document.

s='[{"_id":"6040e2f5ed4a6976ab0e8719","index":0,"guid":"5b71d966-1df3-4f22-8941-27d83e0181f2","isActive":true,"balance":"$2,464.05","picture":"http://placehold.it/32x32","age":36,"eyeColor":"brown","name":"House Hays","gender":"male","company":"WEBIOTIC","email":"househays@webiotic.com","phone":"+1 (921) 461-3048","address":"246 Butler Street, Lydia, Federated States Of Micronesia, 8224","about":"Labore proident eu non dolor reprehenderit et qui et pariatur. Consectetur quis incididunt aliqua cupidatat exercitation ex nulla ullamco reprehenderit deserunt. Velit cillum elit esse eu eiusmod cupidatat in fugiat ullamco. Labore labore officia minim sunt do. Mollit nisi tempor amet in dolor nostrud ipsum. Enim dolore ipsum ad aliquip excepteur in dolore culpa ullamco officia fugiat do sint mollit. Anim et enim nulla laboris irure aute eu excepteur sunt sint.\r\n","registered":"2015-10-22T07:38:03 -02:00","latitude":-79.514578,"longitude":109.061478,"tags":["ad","laboris","proident","exercitation","exercitation","excepteur","consectetur"],"friends":[{"id":0,"name":"Edwards Cooper"},{"id":1,"name":"Simpson Walker"},{"id":2,"name":"Moore Stone"}],"greeting":"Hello, House Hays! You have 3 unread messages.","favoriteFruit":"banana"},{"_id":"6040e2f5385b7b47615c878e","index":1,"guid":"7533ebb0-4155-497f-a638-86bbc88f2359","isActive":true,"balance":"$1,895.01","picture":"http://placehold.it/32x32","age":32,"eyeColor":"blue","name":"Whitley Petersen","gender":"male","company":"RUGSTARS","email":"whitleypetersen@rugstars.com","phone":"+1 (898) 523-3046","address":"954 Downing Street, Mulberry, New York, 837","about":"Consequat dolor non enim exercitation. Ipsum nulla aliqua dolor non excepteur eu dolore. Veniam laboris nulla elit exercitation est ut eiusmod amet consequat Lorem. Cupidatat ea aute do ipsum culpa dolor Lorem. Aliqua eiusmod exercitation do aliqua consequat pariatur cillum labore laboris reprehenderit ad nulla velit. 
Non et nostrud pariatur laborum sint magna et dolor elit. Deserunt nostrud culpa minim velit incididunt excepteur laborum.\r\n","registered":"2020-04-09T11:24:58 -02:00","latitude":-33.683353,"longitude":-110.189524,"tags":["do","elit","dolore","sint","ullamco","cillum","deserunt"],"friends":[{"id":0,"name":"Debbie Wolf"},{"id":1,"name":"Agnes Wilcox"},{"id":2,"name":"Marie Williams"}],"greeting":"Hello, Whitley Petersen! You have 6 unread messages.","favoriteFruit":"banana"},{"_id":"6040e2f5174a92a3e83ffa0b","index":2,"guid":"3c93e708-98a3-4316-ae31-c34b06e615c3","isActive":true,"balance":"$3,116.35","picture":"http://placehold.it/32x32","age":34,"eyeColor":"brown","name":"Maribel Palmer","gender":"female","company":"PYRAMIS","email":"maribelpalmer@pyramis.com","phone":"+1 (981) 596-3081","address":"810 Prince Street, Bethpage, Oregon, 3792","about":"Magna deserunt ex do laborum. Velit magna esse pariatur qui sit fugiat adipisicing est labore veniam adipisicing non velit laboris. Deserunt enim aliquip ad consectetur aliqua non commodo.\r\n","registered":"2019-08-28T09:48:02 -02:00","latitude":-9.35305,"longitude":-139.197267,"tags":["exercitation","ad","officia","qui","aliquip","quis","ea"],"friends":[{"id":0,"name":"Ines Johnson"},{"id":1,"name":"Jacquelyn Matthews"},{"id":2,"name":"April Hayes"}],"greeting":"Hello, Maribel Palmer! You have 6 unread messages.","favoriteFruit":"banana"},{"_id":"6040e2f5e6aa8f70f5d40970","index":3,"guid":"6b79f9a7-39d5-4076-bc0e-b2c00e88c99a","isActive":false,"balance":"$3,668.00","picture":"http://placehold.it/32x32","age":23,"eyeColor":"green","name":"Sheppard Branch","gender":"male","company":"UTARA","email":"sheppardbranch@utara.com","phone":"+1 (848) 436-2200","address":"755 Everett Avenue, Boykin, Nevada, 8790","about":"Excepteur mollit dolor ea nisi commodo. Adipisicing id pariatur dolor ut culpa laborum exercitation ea ex excepteur culpa. 
Cupidatat eu aute dolore quis deserunt velit enim sint nulla adipisicing Lorem laboris labore. Ut officia ad aute esse cupidatat enim labore tempor mollit Lorem esse amet consectetur.\r\n","registered":"2015-03-11T06:13:39 -01:00","latitude":88.153716,"longitude":161.643917,"tags":["sint","culpa","elit","consectetur","qui","consectetur","quis"],"friends":[{"id":0,"name":"Vicky Terry"},{"id":1,"name":"Stevens Lawson"},{"id":2,"name":"Manuela Glenn"}],"greeting":"Hello, Sheppard Branch! You have 10 unread messages.","favoriteFruit":"strawberry"},{"_id":"6040e2f5e42d8b2deda76c13","index":4,"guid":"02518c89-9f2c-41ee-92e7-ea83bb9c592e","isActive":false,"balance":"$1,503.78","picture":"http://placehold.it/32x32","age":26,"eyeColor":"brown","name":"Winters Robertson","gender":"male","company":"ESSENSIA","email":"wintersrobertson@essensia.com","phone":"+1 (974) 450-3181","address":"722 Dahlgreen Place, Keyport, Indiana, 7084","about":"Mollit enim consectetur tempor nulla duis quis qui consectetur. Veniam cillum minim laboris do quis. Sit labore pariatur nostrud occaecat officia.\r\n","registered":"2021-01-25T01:23:37 -01:00","latitude":60.075124,"longitude":94.701108,"tags":["cupidatat","consequat","culpa","sit","commodo","nulla","consectetur"],"friends":[{"id":0,"name":"Bridgett Hampton"},{"id":1,"name":"Macdonald Oneill"},{"id":2,"name":"Mills Robinson"}],"greeting":"Hello, Winters Robertson! You have 10 unread messages.","favoriteFruit":"strawberry"}]'

The -P option enables Perl-compatible regular expressions, -o prints only the matched part of each line, and \K discards everything matched to its left from the reported match.

$ echo "$s" | grep -Po '"company":"\K[^"]+'
WEBIOTIC
RUGSTARS
PYRAMIS
UTARA
ESSENSIA
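
To make the role of \K more concrete, here is a minimal sketch comparing the same pattern with and without it. The one-record `sample` string and the variable names are made up for illustration, and the snippet assumes a grep that accepts -P (GNU grep built with PCRE support).

```shell
# Hypothetical one-record sample, not taken from the data above.
sample='{"name":"Ada","company":"ACME"}'

# Without \K, -o prints the whole match, key included.
with_key=$(printf '%s\n' "$sample" | grep -Po '"company":"[^"]+')

# With \K, everything matched before it is dropped from the output.
value_only=$(printf '%s\n' "$sample" | grep -Po '"company":"\K[^"]+')

printf '%s\n' "$with_key"    # "company":"ACME
printf '%s\n' "$value_only"  # ACME
```

The left-hand part of the pattern still has to match; \K only changes where the reported match starts.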

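This is simpler than sed or awk, but -P is not available everywhere: it is a GNU grep extension that is not part of POSIX, and it apparently depends on how grep was built, so some distributions and other grep implementations may not support it.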
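
Where -P is unavailable, one possible fallback is to keep -o and strip the key afterwards with cut. This is only a sketch: the two-record `sample` string is hypothetical, and it assumes the local grep supports -o, which is widespread but not required by POSIX either.

```shell
# Hypothetical two-record sample standing in for the JSON held in $s.
sample='[{"name":"A","company":"WEBIOTIC"},{"name":"B","company":"RUGSTARS"}]'

# grep -o keeps each "company":"..." pair on its own line; cut then splits
# on double quotes and keeps field 4, which is the value.
companies=$(printf '%s\n' "$sample" | grep -o '"company":"[^"]*"' | cut -d '"' -f 4)

printf '%s\n' "$companies"
```

Like the -P version, this treats JSON as plain text, so it would break on values that contain escaped double quotes.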