
Fixing special characters that are automatically converted to Unicode escapes when data read from MySQL is wrapped in JSON

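    // StringEscapeUtils comes from Apache Commons Lang; JSONObject comes from the JSON library in use
    @Test
    public void xxx() throws Exception {
        ArrayList<JSONObject> list = new ArrayList<>();
        // The inner quotes are curly (“ ”) so the string literal compiles; non-ASCII characters
        // like these may be serialized by the JSON library as Unicode escape sequences
        String s = "Appliances jerry-built, poor quality clothing ...... still believe that “e-commerce custom products” are more affordable";
        JSONObject json = new JSONObject();
        json.put("title", s);
        JSONObject json1 = new JSONObject();
        json1.put("title", s);
        list.add(json);
        list.add(json1);
        // Serialized list as produced by the JSON library
        System.out.println("old:" + list.toString());
        // unescapeJava turns the escape sequences back into the original characters
        System.out.println("new:" + StringEscapeUtils.unescapeJava(list.toString()));
    }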

Output:
before transformation: [{"title": "home appliances cut corners and poor clothing quality"}]
after transformation: [{"title": "home appliances cut corners, poor quality of clothing ...... also believe that “e-commerce customized products” are more affordable"}, {"title": "home appliances cut corners, poor quality of clothing ...... also believe that “e-commerce customized products” are more affordable"}]
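
To show what the unescape step does to the escape sequences, here is a minimal sketch that uses only the JDK. It is not the Commons Lang implementation: the class name UnicodeUnescapeSketch and the method unescapeUnicode are hypothetical, and the sketch handles only \uXXXX sequences, not other Java escapes or escaped backslashes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeUnescapeSketch {
    // Matches a backslash, one or more 'u' characters, then exactly four hex digits
    private static final Pattern UNICODE_ESCAPE = Pattern.compile("\\\\u+([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each escape sequence with the character it encodes
    static String unescapeUnicode(String input) {
        Matcher m = UNICODE_ESCAPE.matcher(input);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the text before the escape unchanged
            out.append(input, last, m.start());
            // Decode the four hex digits into a single UTF-16 code unit
            out.append((char) Integer.parseInt(m.group(1), 16));
            last = m.end();
        }
        out.append(input, last, input.length());
        return out.toString();
    }

    public static void main(String[] args) {
        // Sample input made up for illustration; it mimics an escaped JSON string
        String escaped = "[{\"title\":\"\\u201ccustom products\\u201d\"}]";
        System.out.println("old:" + escaped);
        System.out.println("new:" + unescapeUnicode(escaped));
    }
}
```

In a real project, StringEscapeUtils.unescapeJava remains the simpler choice, since it also covers the other Java escape sequences.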

Web crawler: fetching page data while disguising the request headers as a browser, so that repeated visits do not get the IP blocked

User-Agent: the user agent string is an identifier that carries information such as the browser type, operating system and version, CPU type, rendering engine, browser language, and browser plug-ins. The browser sends the UA string to the server with every HTTP request.

Referer: the HTTP Referer is part of the request header. When a browser sends a request to a web server, it usually includes a Referer that tells the server which page the request was linked from, so the server can use that information in its processing.

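	import java.io.IOException;
	import org.apache.http.HttpEntity;
	import org.apache.http.HttpResponse;
	import org.apache.http.HttpStatus;
	import org.apache.http.client.config.CookieSpecs;
	import org.apache.http.client.config.RequestConfig;
	import org.apache.http.client.methods.HttpGet;
	import org.apache.http.impl.client.CloseableHttpClient;
	import org.apache.http.impl.client.HttpClients;
	import org.apache.http.util.EntityUtils;

	public static String getHtmls(String url) throws IOException {
		// Ignore cookies so that no session state is carried between requests
		RequestConfig globalConfig = RequestConfig.custom().setCookieSpec(CookieSpecs.IGNORE_COOKIES).build();
		String html = "";
		CloseableHttpClient httpClient = HttpClients.custom().setDefaultRequestConfig(globalConfig).build();
		HttpGet httpGet = new HttpGet(url);
		// UA format: browser identifier (OS identifier; encryption level identifier; browser language) rendering engine identifier, version information
		httpGet.setHeader("User-Agent","Mozilla/5.0 (Linux; U; Android 2.3.6; zh-cn; GT-S5660 Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 MicroMessenger/4.5.255");
		// Disguise the Referer header
		httpGet.setHeader("Referer", "https://mp.weixin.qq.com");

		try {
			HttpResponse response = httpClient.execute(httpGet);
			int statusCode = response.getStatusLine().getStatusCode();
			if (statusCode == HttpStatus.SC_OK) {
				HttpEntity entity = response.getEntity();
				if (entity != null) {
					// Get the HTML source; fall back to UTF-8 if the response declares no charset
					html = EntityUtils.toString(entity, "UTF-8");
				}
			}
		} catch (Exception e) {
			System.out.println("request " + url + " error!");
			e.printStackTrace();
		} finally {
			// Close the client and release its connections
			httpClient.close();
		}
		return html;
	}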
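
The same header disguise can also be sketched with only the JDK, for cases where Apache HttpClient is not available. This is an illustrative variant, not the original author's method: the class name BrowserHeadersSketch and the method prepare are hypothetical, and the URL in main is a placeholder. The sketch only prepares the request and reads the headers back; nothing is sent until the connection is actually used.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

public class BrowserHeadersSketch {
    // Same header values as in the HttpClient example above
    static final String USER_AGENT = "Mozilla/5.0 (Linux; U; Android 2.3.6; zh-cn; GT-S5660 Build/GINGERBREAD) AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1 MicroMessenger/4.5.255";
    static final String REFERER = "https://mp.weixin.qq.com";

    // Hypothetical helper: prepares (but does not send) a GET request with browser-like headers
    static HttpURLConnection prepare(String url) throws IOException {
        // openConnection() only creates the connection object; no network traffic happens yet
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setRequestProperty("User-Agent", USER_AGENT);
        conn.setRequestProperty("Referer", REFERER);
        return conn;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder URL, used only to show which headers would be sent
        HttpURLConnection conn = prepare("https://example.com/");
        System.out.println("User-Agent: " + conn.getRequestProperty("User-Agent"));
        System.out.println("Referer: " + conn.getRequestProperty("Referer"));
    }
}
```

Calling conn.getResponseCode() or conn.getInputStream() on the returned object would then perform the request with these headers.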

Parsing JSON with missing double quotation marks

Suppose we need to parse the following JSON data:

{"manifest":{ Version:"3.0"}}

If you look carefully, this string is not valid JSON: the key Version lacks double quotation marks. It should be:

{"manifest":{ "Version": "3.0"}}

Reprinted from: https://www.cnblogs.com/afluy/p/4023838.html

If you parse it with

JSONObject mainfestObject = object.getJSONObject("manifest");

an error is reported. However, if you first read the value as a string and then build a new JSONObject from it:

String mainfestStr = object.optString("manifest", "");

JSONObject mainfestObject = new JSONObject(mainfestStr);

parsing succeeds.
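
When the parser in use is strict about unquoted keys, another option is to repair the string before parsing it. The following is only a naive sketch of that idea, not the approach from the original post: the class name QuoteKeysSketch and the method quoteBareKeys are hypothetical, and the regular expression assumes that keys are plain identifiers and that no string value contains text shaped like `, name:`.

```java
import java.util.regex.Pattern;

public class QuoteKeysSketch {
    // An identifier that directly follows '{' or ',' and is itself followed by ':'
    private static final Pattern BARE_KEY =
            Pattern.compile("([{,]\\s*)([A-Za-z_][A-Za-z0-9_]*)(\\s*:)");

    // Hypothetical helper: wraps every unquoted key in double quotation marks
    static String quoteBareKeys(String json) {
        return BARE_KEY.matcher(json).replaceAll("$1\"$2\"$3");
    }

    public static void main(String[] args) {
        // The malformed example from the text, with the key Version unquoted
        String raw = "{\"manifest\":{ Version:\"3.0\"}}";
        System.out.println(quoteBareKeys(raw));
    }
}
```

For the example above, this prints {"manifest":{ "Version":"3.0"}}, which any JSON parser should then accept. Keys that are already quoted are left untouched.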