11-21-2016, 02:02 PM
Hi guys,
I'm trying to scrape recipes off major sites and save each recipe to its own file.
What is the best way to do this?
I don't really understand the masks aspect of ScrapeBox, so if someone could give me a quick example it would be much appreciated.
For instance, I found this site:
{"@context":"http:\/\/schema.org\/","@type":"Recipe","name":"Chicken Teriyaki","author":"Namiko Chen","image":"http:\/\/cdn-jpg.thedailymeal.net\/sites\/default\/files\/2014\/09\/25\/chicken_teriyaki_namiko_chen_450x360.jpg","description":"Just so you know, the "chicken teriyaki sauce" in a bottle does not taste like real teriyaki sauce in Japan. Teriyaki is a cooking technique. "Teri" means the "luster" given by the sweet soy sauce marinade and "yaki" means "cooking or grilling," and it’s not really the name of the sauce.\r\nClick here to see 5 Essential Japanese Dishes to Know.","aggregateRating":{"@type":"AggregateRating","ratingValue":4.8,"reviewCount":"12","bestRating":5,"worstRating":4},"cookTime":"P0Y0M0DT0H25M0S","recipeYield":"2","recipeIngredient":["1 pound boneless, skin-on chicken breasts or thighs, chopped into large chunks","2 tablespoon soy sauce","2 tablespoon water","1 tablespoon mirin","1 tablespoon sugar","1\/4 onion, grated into a bowl with juices reserved","one 1-inch piece ginger, grated into a bowl with juices reserved","3 tablespoon sake","3 tablespoon vegetable oil"],"recipeInstructions":["Prick the chicken on the flesh side with a fork.\r\n\r\n\tCombine the soy sauce, water, mirin, sugar, onion, ginger, and 1 tablespoon of the sake in a bowl or zip-lock bag. Add the chicken, place in the refrigerator, and marinate for 2-3 hours.\r\n\r\n\tWhen ready to cook, heat 2 tablespoons of the vegetable oil in a large skillet over medium-high heat. When hot, shake off as much marinade as possible from the chicken and place the chicken pieces skin side down. (The marinade will burn easily so try not to add any of the liquid to the pan. Do not throw away the marinade.)\r\n\r\n\tWhen it's nicely browned, flip and cook the other side. Then, add the sake and cook, covered, for 8-10 minutes. Remove the chicken to a plate and clean the skillet. 
Heat the remaining oil over medium-high heat and put the chicken back in the skillet, skin side down first to make the skin crispy. Then, flip again and pour in the marinade. Cook until the sauce is reduced a bit. Baste the chicken a couple of times while cooking. Serve the chicken and pour the sauce on top. "],"nutrition":{"@type":"NutritionInformation","servingSize":"1 serving","calories":"543 calories","fatContent":"23 g","carbohydrateContent":"9 g","fiberContent":"1 g","saturatedFatContent":"5 g","sodiumContent":"1113 mg"}}
NOTE: this text was in the source of the page I viewed.
I want to be able to pull in each page like this recipe, with each file named after the recipe. Specifically, I want to pull in the:
1) name of the recipe
2) recipe description
3) recipe ingredients
4) recipe instructions
Each file would include these 4 things if found on the page (with the name of the recipe being the name of the file).
How would I do that?
Thanks
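For reference, here is a minimal sketch of the four-field extraction described above, written in plain Python rather than with ScrapeBox masks. It rests on assumptions the post does not confirm: that the JSON sits inside a <script type="application/ld+json"> tag (the usual place for schema.org data), and that the page source is already available as a string. The names JsonLdParser, find_recipe, as_lines, safe_filename and save_recipe are made up for this example.

```python
import json
import re
from html.parser import HTMLParser
from pathlib import Path


class JsonLdParser(HTMLParser):
    """Collects the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_jsonld = False

    def handle_data(self, data):
        if self.in_jsonld:
            self.blocks[-1] += data


def find_recipe(html):
    """Return the first JSON-LD object whose @type is "Recipe", or None."""
    parser = JsonLdParser()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        try:
            # strict=False tolerates raw line breaks inside JSON strings.
            data = json.loads(block, strict=False)
        except json.JSONDecodeError:
            continue
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and item.get("@type") == "Recipe":
                return item
    return None


def as_lines(value):
    """Flatten a string, a list of strings, or step objects into clean lines."""
    if value is None:
        return []
    if isinstance(value, (str, dict)):
        value = [value]
    lines = []
    for entry in value:
        text = entry.get("text", "") if isinstance(entry, dict) else str(entry)
        lines.extend(part.strip() for part in text.splitlines() if part.strip())
    return lines


def safe_filename(name):
    """Drop characters that are awkward in file names."""
    return re.sub(r"[^\w\- ]", "", name).strip() or "untitled"


def save_recipe(recipe, out_dir):
    """Write name, description, ingredients and instructions to <name>.txt."""
    name = str(recipe.get("name", "untitled"))
    parts = [name]
    description = as_lines(recipe.get("description"))
    if description:
        parts.append("Description:\n" + "\n".join(description))
    ingredients = as_lines(recipe.get("recipeIngredient"))
    if ingredients:
        parts.append("Ingredients:\n" + "\n".join("- " + i for i in ingredients))
    steps = as_lines(recipe.get("recipeInstructions"))
    if steps:
        numbered = [f"{n}. {s}" for n, s in enumerate(steps, 1)]
        parts.append("Instructions:\n" + "\n".join(numbered))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / (safe_filename(name) + ".txt")
    path.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return path
```

To use it, pass the source of each page to find_recipe(html) and, if it returns a recipe, call save_recipe(recipe, "recipes"). Any of the four fields missing from the page is simply left out of the file. One caveat: the description in the snippet as pasted above contains unescaped double quotes, so that exact text would not parse as JSON; the sketch assumes the live page source escapes them properly.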